ABSTRACT

A program transformation method for transforming a source program
described by a programming language into an object program described
by a language executable by a data processing system includes a
process of transforming at least a part of a procedure, function or
sub-routine used in the source program into a form such that the
object program can be stored in an arbitrary storage region of a
primary storage device of the data processing system; a process of
arranging the procedure, function or sub-routine, whether or not
transformed in the first process, in the storage region corresponding
to a cache line of a cache memory among the storage regions of the
primary storage device without causing cache conflict, on the basis
of information relating to the procedure, function or sub-routine
obtained during the process of transforming the source program into
the object program; and a process of generating the object program on
the basis of the result of the arrangement.

23 Claims, 16 Drawing Sheets 
FIG. 2

extern int count = 0;

void func()
{
    func1();
    func2();
    func1();
}

void func1()
{
    count += 1;
}

void func2()
{
    count += 2;
}
FIG. 8
FIG. 10

FIG. 18 (PRIOR ART)

void func(void)
{
    func_A();
    func_B();
    func_A();
}

FIG. 16 (PRIOR ART)
PROGRAM TRANSFORMATION METHOD AND PROGRAM TRANSFORMATION SYSTEM

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates generally to a program transformation
method, a program transformation system and a storage medium storing
a program transformation program. More particularly, the invention
relates to a program transformation method and a program
transformation system for transforming (compiling) a source program
described by a programming language into an object program described
by a language (machine language, assembly language and so forth)
executable by a computer, a central processing unit (CPU) and the
like.

2. Description of the Related Art

FIG. 15 is a block diagram showing an example of a construction of
the conventional program transformation system disclosed in Japanese
Unexamined Patent Publication No. Heisei 1-118931.

The program transformation system illustrated in FIG. 15 is
constructed with a first program storage portion 151, a compiler 152,
a second program storage portion 153, a third program storage portion
154, an input data storage portion 155, a program executing portion
156, a fourth program storage portion 157 and a parsing result
storage portion 158.

At first, the compiler 152 reads out a source program described by a
programming language, such as C language, from the first program
storage portion 151, temporarily generates an object program
described by a machine language, an assembly language and so forth,
and stores the temporarily generated object program in the second
program storage portion 153.

Here, the temporarily generated object program is the program
generated by transforming the source program into codes of machine
language, assembly language or so forth in the sequential order of
description. While the temporarily generated object program is
executable by the computer, the central processing unit (CPU) and so
forth, since the source program is simply transformed into codes in
the same sequential order as in the source program, it inherently has
redundant portions, which make the size (code size) of the overall
object program large as long as it is held in the temporarily
generated form. Therefore, a large storage capacity is required in a
primary storage device which is adapted to store the temporarily
generated object program. Furthermore, the execution period of the
object program becomes long, which lowers efficiency.

Therefore, it becomes necessary to generate an efficient and optimal
object program. The object program simply transformed into codes from
the source program in the sequential order described in the source
program in the process set forth above will be hereinafter referred
to as the "temporary object program", to distinguish it from an
optimized final object program.

There are various methods for optimizing the object program. Here,
arrangement optimization of the instruction codes of procedures is
considered. A procedure means a group of processes, such as
arithmetic operations, to be executed by the computer or CPU, and is
often called a function or sub-routine. Throughout the disclosure and
claims, the group of processes will be generally referred to as a
"procedure".

In a program, it can become necessary to call another procedure
(hereinafter referred to as the "callee" side procedure) during
execution of some procedure (hereinafter referred to as the "caller"
side procedure) at a certain portion of the program. Therefore, when
the source program is transformed into the object program and the
resultant object program is stored in the primary storage device, if
an instruction code of the callee side procedure and an instruction
code of the caller side procedure closely related to the former are
physically arranged close to each other, a procedure call instruction
can be changed from one for a long jump to one for a short jump.

By this, the code size of the overall object program can be reduced.
In conjunction therewith, the execution speed upon executing the
object program in the computer or the CPU can be higher. Arranging
instruction codes having a high possibility of being executed
sequentially in time at physically close positions in the object
program is called arrangement optimization of the instruction codes
of the procedures.

Next, the program executing portion 156 reads out a procedure call
frequency parsing program from the fourth program storage portion 157
and executes the same. Namely, the program executing portion 156
reads out the temporary object program from the second program
storage portion 153. In conjunction therewith, input data stored in
the input data storage portion 155, input by an operator, is read out
by the program executing portion 156. Then, the program executing
portion 156 simulates execution of the temporary object program and,
in conjunction therewith, integrates the number of times that calls
of other procedures occur in a certain procedure in the temporary
object program. A result of integration is stored in the parsing
result storage portion 158 as a procedure reference frequency parsing
result.

Then, the compiler 152 reads out the procedure reference frequency
parsing result from the parsing result storage portion 158 to
calculate the closeness of the reference relationship between two
arbitrary procedures. On the basis of the resultant closeness,
arrangement optimization of the instruction code is performed to
generate the final object program, which is stored in the third
program storage portion 154.

On the other hand, FIG. 16 is a block diagram showing an example of a
construction of the conventional program transformation system
disclosed in Japanese Unexamined Patent Publication No. Heisei
9-34725.

The program transformation system illustrated in FIG. 16 is generally
constructed with a source program storage portion 161, a compiler 162
and an object program storage portion 163.

The compiler 162 is generally constructed with a parsing portion 164,
a procedure call occurrence counting portion 165, a code generating
portion 166, a procedure call count data storage portion 167, a
special space arranged procedure determining portion 168 and an
object program outputting portion 169. Here, a special space means a
special region of a finite code size set in a part of a program
space.

The parsing portion 164 reads out the source program to be parsed
from the source program storage portion 161 and parses the syntax
forming the source program. The procedure call occurrence counting
portion 165 counts the number of calls of each procedure recognized
by the parsing portion 164 upon parsing the syntax.

The code generating portion 166 performs code generation twice.
Namely, at the first code generation, the code generating portion 166
generates a normal code if the syntax is
not the procedure call instruction, and generates an instruction code
using a normal call instruction if the syntax is the procedure call
instruction, on the basis of the result of parsing by the parsing
portion 164. On the other hand, in the second code generation, the
code generating portion 166 scans the result of the first code
generation from the leading end. Then, if the code is the procedure
call instruction and a result of inquiring of the special space
arranged procedure determining portion 168 shows that the procedure
is a special space arranged procedure, that is, one determined to be
arranged within the special space, the normal call instruction code
having a large byte count is replaced with a dedicated call
instruction code having a smaller byte count.

The procedure call count data storage portion 167 stores, per
procedure, the call count counted by the procedure call occurrence
counting portion 165 and the code size of the code generated in the
first code generation. The special space arranged procedure
determining portion 168 selects and determines the procedures to be
arranged within the special space, giving preference to procedures
having a greater call count, so that the sum of the code sizes of the
procedures to be arranged within the special space falls within the
code size of the special space, on the basis of the call count and
code size per procedure stored in the procedure call count data
storage portion 167.
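
The selection rule just described (prefer procedures with a greater call count while keeping the total code size within the special space) can be sketched as a greedy pass. This is only an illustrative sketch, not the implementation of the cited publication: the structure, its field names and the function name are hypothetical, and the decision to skip a procedure that does not fit and continue with less frequently called ones is an assumption, since the text does not say how that case is handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-procedure record; the field names are assumptions. */
struct proc_info {
    const char *name;
    unsigned    call_count;  /* calls counted while parsing */
    size_t      code_size;   /* bytes produced by the first code generation */
    int         in_special;  /* set to 1 when chosen for the special space */
};

/* qsort comparator: highest call count first. */
static int by_call_count_desc(const void *a, const void *b)
{
    const struct proc_info *pa = a;
    const struct proc_info *pb = b;
    if (pa->call_count != pb->call_count)
        return (pa->call_count < pb->call_count) ? 1 : -1;
    return 0;
}

/* Greedy selection: sort by call count, then take every procedure that
 * still fits in the remaining special space. Returns the bytes used.
 * The array is reordered in place (most frequently called first). */
size_t select_special_space(struct proc_info *procs, size_t n,
                            size_t space_size)
{
    size_t used = 0;

    if (n == 0)
        return 0;
    assert(procs != NULL);

    qsort(procs, n, sizeof procs[0], by_call_count_desc);
    for (size_t i = 0; i < n; i++) {
        procs[i].in_special = 0;
        if (procs[i].code_size <= space_size - used) {
            procs[i].in_special = 1;
            used += procs[i].code_size;
        }
    }
    return used;
}
```

Calls to a procedure whose `in_special` flag is set would then be the ones eligible for the shorter dedicated call instruction in the second code generation.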

The object program outputting portion 169 outputs the code to a
segment to which an attribute of arrangement in the special space is
added when a result of inquiry of the special space arranged
procedure determining portion 168 shows that the code generated by
the code generating portion 166 is the code of a definition portion
of a special space arranged procedure, and outputs the code to a
normal segment when the code generated by the code generating portion
166 is not the code of the definition portion of a special space
arranged procedure. Here, a segment means a group of codes forming
the minimum unit of arrangement when the code is arranged within the
program space.

As set forth above, the object program outputting portion 169
separates the special space arranged procedures from the normal
procedures. Next, the object program outputting portion 169 outputs
data of a parameter region or so forth, outputs a code portion and a
data portion in combination as the object program, and stores it in
the object program storage portion 163.

With the construction set forth above, the code size of the generated
object program can be reduced. In association with this, the program
space can be saved. Also, the execution speed upon execution of the
object program by the computer or the CPU can be higher.

On the other hand, in the conventional program transformation system
disclosed in Japanese Unexamined Patent Publication No. Heisei
1-118931, since the only procedures subject to arrangement
optimization of the instruction code are those which the user defines
in the source program, improvement of efficiency of the object
program is limited.

Likewise, in the conventional program transformation system disclosed
in Japanese Unexamined Patent Publication No. Heisei 9-34725, since
the special space has a finite code size, the procedures to be
arranged within the special space are limited. Therefore, improvement
of efficiency of the object program is limited.

On the other hand, when the object program generated by the program
transformation system is to be executed by a one-chip microcomputer
consisting of a CPU, a decoder and so forth, the object program is
stored in an external primary storage device so that each code of the
object program is read out sequentially from the primary storage
device. Then, after decoding by the decoder, the CPU parses the
object program for execution. In this case, in order to speed up
execution by the CPU, a cache memory, which has a small storage
capacity and a high access speed and temporarily stores the codes
read out from the primary storage device, which normally has a large
storage capacity and a low access speed, is provided in the one-chip
microcomputer.

In the one-chip microcomputer provided with the cache memory, when
the CPU executes the code, each code in the object program read out
from the primary storage device cannot be decoded by the decoder and
parsed and executed by the CPU until it is once stored in the cache
memory. In a one-chip microcomputer of this kind, there are various
methods of storing each code read out from the primary storage device
in the cache memory. Among them, a direct map method is one of the
methods for storing each code in the cache memory.

As shown in FIG. 17, in the direct map method, a cache memory 171 is
divided into a plurality of storage regions (hereinafter referred to
as cache lines). In conjunction therewith, the storage region of the
primary storage device 172 is also divided. Each storage region of
the primary storage device 172 is made to correspond to a cache line
of the cache memory 171.

In FIG. 17, the cache memory 171 consists of five cache lines 171a to
171e. Corresponding to these, the primary storage device 172 is
divided into storage regions each having the same storage capacity as
each cache line. The storage regions are made to correspond to the
five cache lines 171a to 171e in units of five. Namely, storage
regions 172-1a to 172-1e of the primary storage device 172 correspond
to the cache lines 171a to 171e as a group. Similarly, the storage
regions 172-2a to 172-2e correspond to the cache lines 171a to 171e.
The final storage regions 172-na to 172-ne (n is a natural number)
also correspond to the cache lines 171a to 171e.
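
The correspondence of FIG. 17 amounts to taking the region number modulo the number of cache lines. The following is a minimal sketch under that reading; the function names, the zero-based region numbering and the `line_size` parameter are assumptions introduced for illustration and do not appear in the disclosure.

```c
#include <assert.h>

enum { NUM_CACHE_LINES = 5 };  /* cache lines 171a to 171e in FIG. 17 */

/* Map a zero-based storage region number to a cache line index.
 * Index 0..4 stands for cache lines 171a..171e; region 0 stands for
 * 172-1a, region 5 for 172-2a, and so on. */
unsigned cache_line_of_region(unsigned long region)
{
    return (unsigned)(region % NUM_CACHE_LINES);
}

/* The same mapping starting from a byte address. line_size is the
 * capacity of one cache line in bytes (an assumed parameter). */
unsigned cache_line_of_address(unsigned long address,
                               unsigned long line_size)
{
    assert(line_size > 0);
    return cache_line_of_region(address / line_size);
}
```

Under this numbering, regions 0 and 5 (172-1a and 172-2a) both map to index 0 (cache line 171a), which is the situation in which two procedures can compete for the same cache line.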

When the object program to be executed by a one-chip microcomputer
employing the direct map method is generated using the program
transformation system, the following drawbacks are encountered.

For example, when the source program described by C language shown in
FIG. 18 is transformed into the object program by the program
transformation system, the respective instruction codes of the
procedures func_A and func_B are stored in the primary storage device
172 as shown in FIG. 19.

In FIG. 19, the instruction code of the procedure func_A is stored in
the storage regions 172-1a to 172-1c of the primary storage device
172. Also, the instruction code of the procedure func_B is stored in
the storage regions 172-2a to 172-2b of the primary storage device
172. Accordingly, the instruction code of the procedure func_A
corresponds to the cache lines 171a to 171c of the cache memory 171.
On the other hand, the instruction code of the procedure func_B
corresponds to the cache lines 171a and 171b of the cache memory 171.

In such a case, when the CPU executes the object program generated by
transformation of the source program shown in FIG. 18, the
instruction code of the procedure func_A is read out from the storage
regions 172-1a to 172-1c of the primary storage device 172 and is
once stored in the cache lines 171a to 171c in the cache memory 171,
and thereafter decoded by the decoder and parsed and executed by the
CPU.
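
The effect of the FIG. 19 placement on a direct-mapped cache can be traced with a small simulation. This is a hedged sketch rather than anything given in the disclosure: the type and function names are hypothetical, regions are numbered from zero (172-1a is region 0, 172-2a is region 5), and each procedure is modeled simply as a run of consecutive regions fetched once per call.

```c
#include <assert.h>

enum { SIM_LINES = 5 };  /* five cache lines, as in FIG. 17 */

/* Direct-mapped cache model: each line remembers which storage region
 * it currently holds, or -1 when it is empty. */
struct dm_cache {
    long tag[SIM_LINES];
};

void dm_cache_init(struct dm_cache *c)
{
    for (int i = 0; i < SIM_LINES; i++)
        c->tag[i] = -1;
}

/* Execute a procedure occupying `count` consecutive regions starting at
 * region `first`. Returns how many regions had to be loaded from the
 * primary storage device because the cache line held something else. */
unsigned dm_cache_run(struct dm_cache *c, long first, long count)
{
    unsigned loads = 0;

    assert(first >= 0 && count >= 0);
    for (long r = first; r < first + count; r++) {
        int line = (int)(r % SIM_LINES);
        if (c->tag[line] != r) {
            c->tag[line] = r;  /* overwrite whatever was there */
            loads++;
        }
    }
    return loads;
}
```

Running func_A (regions 0 to 2), func_B (regions 5 to 6) and func_A again gives 3, 2 and 2 loads, because func_B overwrites two of the lines holding func_A. If func_B were instead placed at regions 8 to 9 (an alternative placement chosen here purely for illustration), the second execution of func_A would need no loads at all.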



US 6,282,707 B1

Next, the instruction code of the procedure func_B is read out from
the storage regions 172-2a to 172-2b of the primary storage device
172, and temporarily stored in the cache lines 171a to 171b of the
cache memory 171. Here, while a part of the instruction code of the
procedure func_A has already been stored in the cache lines 171a and
171b of the cache memory 171, the instruction code of the procedure
func_B is stored over it (overwritten). Therefore, that part of the
instruction code of the procedure func_A cannot be read subsequently.
Thereafter, the instruction code of the procedure func_B stored in
the cache lines 171a to 171b of the cache memory 171 is decoded by
the decoder and parsed and executed by the CPU.

Next, according to the source program shown in FIG. 18, the
instruction code of the procedure func_A has to be executed again.
However, since the instruction code of the procedure func_B is
already stored in the cache lines 171a and 171b of the cache memory
171, a part of the instruction code of the procedure func_A cannot be
read out. Therefore, the instruction code of the procedure func_A is
again read out from the storage regions 172-1a to 172-1b of the
primary storage device 172. Then, the instruction code of the
procedure func_A is temporarily stored in the cache lines 171a to
171b of the cache memory 171, decoded by the decoder and parsed and
executed by the CPU.

As set forth above, when the instruction codes of two procedures
which have a high possibility of being executed sequentially in time
are stored in storage regions of the primary storage device 172
corresponding to the same cache lines of the cache memory (this will
be referred to as being loaded on the same cache line), all or a part
of the instruction codes previously read out from the primary storage
device and stored in the cache memory 171 is overwritten by the
instruction codes of the procedure subsequently read out from the
primary storage device and written on the same cache line of the
cache memory 171. Such a condition is referred to as conflict (cache
conflict). If such conflict is caused frequently, the effect of the
cache memory in speeding up execution by the CPU can be negated.
Worse, it is possible to slow down the execution speed of the CPU.

As methods of storing each code read out from the primary storage
device in the cache memory, in addition to the direct map method,
there are a fully associative method, which permits storing of data
of the primary storage device in any of the cache lines of the cache
memory, a set associative method, which is intermediate between the
direct map method and the fully associative method and in which a
plurality of cache lines in which the data of the primary storage
device may be arranged are present, and so forth. With these methods
as well, conflict of procedures on the cache memory can be caused, as
set forth above.

In the conventional program transformation systems disclosed in
Japanese Unexamined Patent Publication No. Heisei 1-118931 and
Japanese Unexamined Patent Publication No. Heisei 9-34725, no
consideration has been given to conflict as set forth above.
Therefore, as a result of arrangement optimization of the instruction
code of the procedure or arrangement of the procedure in the special
space, if the instruction codes of two procedures having a high
possibility of being executed sequentially in time are loaded on the
same cache line of the cache memory 171, conflict is inherent.
Accordingly, even if the code size of the overall object program can
be reduced, the execution speed of the CPU cannot be accelerated.

On the other hand, for a procedure frequently used in execution of
the object program, the execution speed of the CPU cannot be
increased if the instruction code is read from the primary storage
device 172 and stored in the corresponding cache lines of the cache
memory every time of use. A case where the instruction code cannot be
read out, because it is not stored in the cache memory 171 or because
of conflict, is generally called a cache miss. Therefore, it becomes
necessary to keep the frequently used procedure in the cache memory
171 as long as possible without causing conflict.

However, in the conventional program transformation systems disclosed
in Japanese Unexamined Patent Publication No. Heisei 1-118931 and
Japanese Unexamined Patent Publication No. Heisei 9-34725, nothing is
considered with respect to cache misses in execution of the object
program. Accordingly, in this point as well, the execution speed of
the CPU cannot be increased.

SUMMARY OF THE INVENTION

It is an object of the present invention to provide a program
transformation method and a program transformation system which can
successfully prevent conflict of various procedures on the cache
memory, can prevent cache misses of frequently used procedures, and
can thereby speed up execution of an object program by a computer.

According to the first aspect of the invention, a program
transformation method for transforming a source program described by
a programming language into an object program described by a language
executable by a data processing system, comprises

first process of transforming at least a part of procedure, function
or sub-routine used in the source program into a form so that the
object program can be stored in an arbitrary storage region of a
primary storage device of the data processing system,

second process of arranging procedure, function or sub-routine
transformed or not transformed in the first process in the storage
region corresponding to cache line of a cache memory among storage
region of the primary storage device without causing cache conflict
on the basis of information relating to the procedure, function or
sub-routine obtained during a process of transformation of the source
program into the object program, and

third process of generating the object program, on the basis of the
result of arrangement.

In the preferred construction, the procedure, function or sub-routine
is at least one of that defined by a user in the source program, that
defined and inspected by the user, that preliminarily prepared in a
processing system in the programming language and that preliminarily
prepared in a form of instruction code.

In another preferred construction, the information is obtained by
execution of a temporary object program transformed from the source
program and consists of information indicative of the number of times
that the procedure, function or sub-routine is actually called and
information indicative of call relationship between procedures,
functions or sub-routines.

In another preferred construction, the information is obtained by
execution of a temporary object program transformed from the source
program and consists of information indicative of the number of times
that the procedure,
function or sub-routine is actually called and information
indicative of call relationship between procedures, functions
or sub-routines,

in the second process,

the procedures, functions or sub-routines are divided into
a plurality of groups on the basis of call frequency, and

the procedures, functions or sub-routines are arranged in
the storage region corresponding to cache lines of a
cache memory among the storage region of the primary
storage device.
According to the second aspect of the invention, a program
transformation method for transforming a source program described by
a programming language into an object program described by a language
executable by a data processing system, comprises

first process of transforming at least a part of procedure, function
or sub-routine used in the source program into a form for storing in
an arbitrary storage region of a primary storage device of the data
processing system when the object program is used in the data
processing system,

second process of transforming the source program into the object
program and, in conjunction therewith, transforming, for the object
program, procedure, function or sub-routine defined by a user in the
source program into a form storable in an arbitrary region of the
primary storage device,

third process of linking the procedure, function or sub-routine
transformed in the first process and the object program obtained in
the second process,

fourth process of collecting dynamic information consisting of
information indicative of the number of times that the procedure,
function or sub-routine is actually called, and information
indicative of call relationship between the procedures, functions or
sub-routines, by executing the object program obtained through the
third process,

fifth process of arranging the procedure, function or sub-routine in
the storage region corresponding to the cache line of the cache
memory among the storage region of the primary storage device while
avoiding cache conflict, on the basis of the dynamic information, and

sixth process of generating a final object program by linking the
procedure, function or sub-routine transformed in the first process
and the object program obtained in the second process, on the basis
of the arrangement information.
In the preferred construction, the procedure, function or
sub-routine is at least one of that defined by a user in the
source program, that defined and inspected by the user, that
preliminarily prepared in a processing system in the programming
language and that preliminarily prepared in a form of instruction
code.

In another preferred construction, in the second process,

the procedures, functions or sub-routines are divided into
a plurality of groups on the basis of call frequency, and

the procedures, functions or sub-routines are arranged in
the storage region corresponding to cache lines of a
cache memory among the storage region of the primary
storage device.
According to the third aspect of the invention, a program
transformation method for transforming a source program described by
a programming language into an object program described by a language
executable by a data processing system, comprises

first process of transforming the source program into a temporary
object program and, in conjunction therewith, inserting a code for
counting the number of times that the procedure, function or
sub-routine is actually called upon executing the temporary object
program,

second process of linking one of the procedure, function or
sub-routine that defined by a user in the source program, that
defined and inspected by the user, that preliminarily prepared in a
processing system in the programming language and that preliminarily
prepared in a form of instruction code, with the temporary object
program obtained through the first process,

third process of collecting dynamic information consisting of
information indicative of the number of times that the procedure,
function or sub-routine is actually called, and information
indicative of call relationship between the procedures, functions or
sub-routines, by executing the object program obtained through the
second process,

fourth process of arranging the procedure, function or sub-routine in
the storage region corresponding to the cache line of the cache
memory among the storage region of the primary storage device while
avoiding cache conflict, on the basis of the dynamic information,

fifth process of transforming at least part of one defined by a user
in the source program, one defined and inspected by the user, one
preliminarily prepared in a processing system in the programming
language and one preliminarily prepared in a form of instruction
code, among the procedure, function or sub-routine to be used in the
source program, into a form storable in an arbitrary storage region
of the primary storage device, in which the object program is stored
when actually used in the data processing system,

sixth process of, after transforming the source program into the
object program, transforming, for the object program, procedure,
function or sub-routine defined by a user in the source program into
a form storable in an arbitrary region of the primary storage device,
and

seventh process of generating a final object program by linking the
procedure, function or sub-routine transformed in the fifth process
and the object program obtained in the sixth process, on the basis of
the arrangement information.
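
The dynamic information named in the processes above (how often each procedure is actually called, and which procedure calls which) could be gathered by counting code of roughly the following shape. This is an illustrative sketch only: the disclosure does not give the inserted code at this point, so the structure, the fixed table sizes, the integer procedure identifiers and the enter/leave hooks are all assumptions.

```c
#include <assert.h>

enum { MAX_PROCS = 8, MAX_DEPTH = 32 };  /* arbitrary limits (assumed) */

/* Collected dynamic information. Zero-initialize before use. */
struct call_profile {
    unsigned calls[MAX_PROCS];             /* times each procedure ran */
    unsigned edges[MAX_PROCS][MAX_PROCS];  /* edges[caller][callee] */
    int      stack[MAX_DEPTH];             /* currently active procedures */
    int      depth;
};

/* Hook assumed to be inserted at the entry of every procedure. */
void profile_enter(struct call_profile *p, int id)
{
    assert(id >= 0 && id < MAX_PROCS);
    assert(p->depth < MAX_DEPTH);

    p->calls[id]++;
    if (p->depth > 0)
        p->edges[p->stack[p->depth - 1]][id]++;  /* record caller -> callee */
    p->stack[p->depth++] = id;
}

/* Hook assumed to be inserted at every return from a procedure. */
void profile_leave(struct call_profile *p)
{
    assert(p->depth > 0);
    p->depth--;
}
```

After a run, `calls` holds the call frequency per procedure and `edges` holds the call relationship, which together are the two kinds of information an arrangement step would consume.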

In the preferred construction, in the fourth process,

the procedures, functions or sub-routines are divided into
a plurality of groups on the basis of call frequency, and

the procedures, functions or sub-routines are arranged in
the storage region corresponding to cache lines of a
cache memory among the storage region of the primary
storage device.

According to the fourth aspect of the invention, a program
transformation system for transforming a source program described by
a programming language into an object program described by a language
executable by a data processing system, comprises

procedure transforming means for transforming at least a part of
procedure, function or sub-routine used in the source program into a
form so that the object program can be stored in an arbitrary storage
region of a primary storage device of the data processing system,

optimizing means for arranging procedure, function or sub-routine
transformed or not transformed in the procedure transforming means in
the storage region corresponding to cache line of a cache memory
among storage region of the primary storage device without causing
cache conflict on the basis of information relating to the procedure,
function or sub-routine obtained during a process of transformation
of the source program into the object program, and

generating means for generating the object program, on the basis of
the result of arrangement.
In the preferred construction, the procedure, function or sub-routine
is at least one of that defined by a user in the source program, that
defined and inspected by the user, that preliminarily prepared in a
processing system in the programming language and that preliminarily
prepared in a form of instruction code.

In another preferred construction, the information is obtained by
execution of a temporary object program transformed from the source
program and consists of information indicative of the number of times
that the procedure, function or sub-routine is actually called and
information indicative of call relationship between procedures,
functions or sub-routines.

In another preferred construction, the information is obtained by
execution of a temporary object program transformed from the source
program and consists of information indicative of the number of times
that the procedure, function or sub-routine is actually called and
information indicative of call relationship between procedures,
functions or sub-routines,

in the optimizing means,

the procedures, functions or sub-routines are divided into a
plurality of groups on the basis of call frequency, and

the procedures, functions or sub-routines are arranged in the storage
region corresponding to cache lines of a cache memory among the
storage region of the primary storage device.

According to the fifth aspect of the invention, a program
transformation system for transforming a source program
described by a programming language into an object
program described by a language executable by a data
processing system, comprises

procedure transforming means for transforming at least a 
part of procedure, function or sub-routine used in the 
source program into a form for storing in an arbitrary 
storage region of a primary storage device of the data 
processing system when the object program is used in 
the data processing system, 

program transforming means for transforming the source 
program into the object program, and in conjunction 
therewith, and concerning the object program, trans- 
forming procedure, function or sub-routine defined by 
a user in the source program into a form storable in 
arbitrary region of the primary storage device, 

linking means for linking the procedure, function or
sub-routine transformed in the procedure transforming
means and the object program obtained in the
program transforming means,

dynamic information collecting means for collecting
dynamic information consisting of information
indicative of number of times that the procedure, function
or sub-routine is actually called, and information
indicative of call relationship between the procedure,
function or sub-routine while executing the object program
obtained through the linking means, and

optimizing means for arranging the procedure, function or
sub-routine in the storage region corresponding to the
cache line of the cache memory among the storage
region of the primary storage device while avoiding
cache conflict, on the basis of the dynamic information,

the linking means generating a final object program by
linking the procedure, function or sub-routine
transformed in the procedure transforming means and the
object program obtained in the program transforming
means, on the basis of the arrangement information.

According to the sixth aspect of the invention, a program 
transformation system for transforming a source program 
described by a programming language into an object pro- 
gram described by a language executable by a data process- 
ing system, comprises 

program transforming means for transforming the source
program into a temporary object program, and in
conjunction therewith, inserting a code for counting,
upon execution of the temporary object program, the
number of times that the procedure, function or
sub-routine is actually called,

linking means for linking one of the procedure, function
or sub-routine defined by a user in the source
program, that defined and inspected by the user, that
preliminarily prepared in a processing system in the
programming language and that preliminarily prepared
in a form of instruction code, with the temporary object
program obtained through the program transforming
means,

dynamic information collecting means for collecting
dynamic information consisting of information
indicative of number of times that the procedure, function
or sub-routine is actually called, and information
indicative of call relationship between the procedure,
function or sub-routine while executing the object program
obtained through the linking means,

optimizing means for arranging the procedure, function or
sub-routine in the storage region corresponding to the
cache line of the cache memory among the storage
region of the primary storage device while avoiding
cache conflict, on the basis of the dynamic information,
and

procedure transforming means for transforming at least
part of one defined by a user in the source program, one
defined and inspected by the user, one preliminarily
prepared in a processing system in the programming
language and one preliminarily prepared in a form of
instruction code among the procedure, function or
sub-routine to be used in the source program into a
form storable in an arbitrary storage region of the
primary storage device, in which the object program is
stored as actually used in the data processing system,

the program transforming means transforming the source
program into the object program and, concerning the
object program, transforming procedure, function or
sub-routine defined by a user in the source program into a
form storable in arbitrary region of the primary storage
device, and the linking means generating a final object
program by linking the procedure, function or
sub-routine transformed in the procedure transforming
means and the object program obtained in the program
transforming means, on the basis of the arrangement
information.

According to another aspect of the invention, a computer
readable memory storing a language processing program for
transforming a source program described by a programming
language into an object program described by a language
executable by a data processing system, the language
processing program comprises

first process of transforming at least a part of procedure,
function or sub-routine used in the source program into
a form so that the object program can be stored in an
arbitrary storage region of a primary storage device of
the data processing system,

second process of arranging procedure, function or
sub-routine transformed or not transformed in the first
process in the storage region corresponding to cache
line of a cache memory among storage region of the
primary storage device without causing cache conflict
on the basis of information relating to the procedure,
function or sub-routine obtained during a process of
transformation of the source program into the object
program, and

third process of generating the object program, on the
basis of the result of arrangement.

Other objects, features and advantages of the present
invention will become clear from the detailed description
given herebelow.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention will be understood more fully from
the detailed description given herebelow and from the
accompanying drawings of the preferred embodiment of the
present invention, which, however, should not be taken to be
limitative to the invention, but are for explanation and
understanding only.

In the drawings:

FIG. 1 is a block diagram showing a construction of the
first embodiment of a program transformation system
according to the present invention;

FIG. 2 is an illustration showing one example of a source
program to be used in the first embodiment of the program
transformation system of FIG. 1;

FIG. 3 is a flowchart showing operation of the first
embodiment of the program transformation system of
FIG. 1;

FIG. 4 is a flowchart showing an arrangement optimizing
process of a procedure of an optimizing portion in the first
embodiment of the program transformation system of FIG.
1;

FIG. 5 is an illustration showing one example of number
of cache lines to be occupied by procedures A to G;

FIG. 6 is an illustration showing one example of a
procedure call graph to be generated by the optimizing
portion;

FIG. 7 is an explanatory illustration for explaining the
arrangement optimization process of the procedures in the
optimizing portion in the first embodiment of the program
transformation system shown in FIG. 1;

FIG. 8 is an illustration showing one example of an
arrangement information;

FIG. 9 is an explanatory illustration for explaining
drawback to be caused when the arrangement optimizing
process of the procedure is not performed;

FIG. 10 is an illustration showing the case where
procedures C and D in the procedure call graph of FIG. 6 are
standard library procedure;

FIG. 11 is an illustration for explaining drawback in the
case where the standard library procedure is placed out of
object for the arrangement optimizing process of the
procedure;

FIG. 12 is a block diagram showing a construction of the
second embodiment of a program transformation system
according to the present invention;

FIG. 13 is a block diagram showing a construction of the
third embodiment of a program transformation system
according to the present invention;

FIG. 14 is a flowchart showing operation of the third
embodiment of the program transformation system of FIG.
13;

FIG. 15 is a block diagram showing a first example of the
construction of the conventional program transformation
system;

FIG. 16 is a block diagram showing a second example of
the construction of the conventional program transformation
system;

FIG. 17 is an illustration for explaining a relationship
between a cache memory and a primary storage device in a
direct map method;

FIG. 18 is an illustration showing one example of the case
where a source program to be used in the prior art is
expressed by C language; and

FIG. 19 is an explanatory illustration for explaining
conflict between procedure on a cache memory.

DESCRIPTION OF THE PREFERRED
EMBODIMENT

The present invention will be discussed hereinafter in
detail in terms of the preferred embodiment of the present
invention with reference to the accompanying drawings. In
the following description, numerous specific details are set
forth in order to provide a thorough understanding of the
present invention. It will be obvious, however, to those
skilled in the art that the present invention may be practiced
without these specific details. In other instance, well-known
structures are not shown in detail in order to avoid
unnecessarily obscuring the present invention.

(First Embodiment)

FIG. 1 is a block diagram showing a construction of the
first embodiment of a program transformation system
according to the present invention.

The shown embodiment of a program transformation
system is generally constructed with first to fourth program
storage portions 31 to 34, a compiler 35, a linker 36, a
profiler 37, first and second information storage portions 38
and 39, an optimizing portion 40, first and second library
storage portions 41 and 42, and a library generating portion
43.

The first program storage portion 31 is constructed with a
storage medium, such as a semiconductor memory including
ROM, RAM or so forth, an FD (floppy disk), an HD (hard
disk), a CD-ROM or the like. In the first program storage
portion 31, a source program described by a programming
language, such as C language or so forth, is stored
preliminarily. In the shown embodiment, discussion will be
given for the case where C language is used as the
programming language.

The compiler 35 compiles the source program into a
rearrangeable object program and thereafter transforms it
into a rearrangeable object program which can be arranged
per procedure to store in the second program storage
portion 32. Here, rearrangeable object program is an object
program which can be stored in any storage regions of the
primary storage device. Also, arrangeable per procedure
means that arrangement of the procedure can be done within
the rearrangeable object program.
It should be noted that, in the shown embodiment, the
procedure generally represents not only the procedure in
its original meaning but also function and sub-routine, as
set forth above. The procedure includes a user procedure, a
user library procedure, a standard library procedure, a
run-time library procedure and so forth.

Here, the user procedure is a procedure defined by the
user in the source program. For example, when the user
prepares the source program shown in FIG. 2, procedures
func, func1 and func2 are all user procedures.

The user library procedure is originally a user
procedure which is considered to have high general
applicability and is thus stored in the first library storage
portion 41 after inspection, such as debugging or so forth.
For example, when the source program shown in FIG. 2 is
compiled into the rearrangeable object program and then
stored in the first library storage portion 41 after a
process such as debugging or so forth, all of the
procedures func, func1 and func2 become user library
procedures.

The standard library procedure is the procedure
preliminarily prepared in a processing system, such as a
compiler or so forth, in the programming language
describing the source program and can be used without
definition by the user. For example, in C language, a
procedure printf for outputting a character string as
standard output, a procedure strlen returning a length of
the character string and so forth are the standard library
procedures.

The run-time library procedure is a procedure which has
high general applicability but large code size, and which is
therefore preliminarily described in a form of instruction
code and preliminarily stored in the first library storage
portion 41. For an instruction string having high general
applicability and large code size, efficiency would be
lowered if the compiler 35 generated the instruction code
every time the object program is generated. Therefore, such
an instruction string is preliminarily established as a
procedure described by the instruction code so that a code
calling such procedure is generated upon generation of the
object program. Then, such procedure is linked by the
linker 36 later. For example, when a float type parameter or
operation is described in the source program despite the
fact that the CPU or the like executing the final object
program does not have floating-point instructions, the
compiler 35 generates the object program using procedures
each consisting of a plurality of instruction strings, such
as float procedure add, float procedure sub and so forth.
The float procedure add and float procedure sub are
run-time library procedures.

The second program storage portion 32 is constructed 
with the storage medium, such as semiconductor memory 
including RAM or the like, FD, HD and so forth and stores 
rearrangeable object program arrangeable per procedure. 

The linker 36 establishes a link between the rearrangeable
object program arrangeable per procedure and stored in the
second program storage portion 32 and a rearrangeable
library (which will be discussed later) arrangeable per
procedure stored in the second library storage portion 42, to
generate an executable temporary object program to store in
the third program storage portion 33. In addition, on the
basis of the arrangement information (which will be
discussed later) stored in the second information storage
portion 39, the linker 36 establishes a link between the
rearrangeable object program arrangeable per procedure and
the rearrangeable library arrangeable per procedure to
generate an executable final object program to store in the
fourth program storage portion 34.
The third program storage portion 33 is constructed with
a storage medium, such as a semiconductor memory
including RAM or so forth, FD, HD or so forth and stores the
temporary object program. The fourth program storage
portion 34 is constructed with a storage medium, such as a
semiconductor memory including RAM or so forth, FD, HD
or so forth and stores the final object program.

The profiler 37 consists of a hardware emulator, a
software emulator and so forth and collects dynamic
information (profile information) consisting of call
relationship between procedures, call count of respective
procedures, loop structure information and so forth while
executing the temporary object program read out from the
third program storage portion 33. Then, the dynamic
information thus obtained is stored in the first information
storage portion 38.
Here, the loop structure information is information
indicating that a certain procedure is called in the loop
structure. Particularly, certain markers which enable
recognition of the start and end of the loop structure during
operation of the profiler 37 are written in the loop structure
of the source program. Then, by recognizing the start and
end of the loop structure with the markers, the profiler 37
can recognize that a procedure called in the loop structure
is in the loop structure. By this, it can be judged that
possibility of sequential execution of the procedures within
the loop structure is high.
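
The kind of dynamic information described above can be pictured with a small instrumentation sketch in C. It is only an illustration: the embodiment collects this information with a hardware or software emulator, whereas the sketch assumes explicit hook calls. All names (prof_enter, prof_leave, prof_loop_begin, prof_loop_end, MAX_PROCS, MAX_DEPTH) are hypothetical, and procedures are assumed to be identified by small integer IDs.

```c
#include <assert.h>

#define MAX_PROCS 8   /* hypothetical upper bound on procedure IDs */
#define MAX_DEPTH 64  /* hypothetical upper bound on call nesting */

unsigned call_count[MAX_PROCS];            /* calls per procedure */
unsigned call_edge[MAX_PROCS][MAX_PROCS];  /* [caller][callee] call counts */
unsigned in_loop_calls[MAX_PROCS];         /* calls seen while a loop marker is open */

static int call_stack[MAX_DEPTH];
static int depth;       /* current call nesting depth */
static int loop_depth;  /* greater than 0 while inside a marked loop */

/* Marker written at the start of a loop structure. */
void prof_loop_begin(void) { loop_depth++; }

/* Marker written at the end of a loop structure. */
void prof_loop_end(void) { assert(loop_depth > 0); loop_depth--; }

/* Record entry into procedure `callee`. */
void prof_enter(int callee)
{
    assert(callee >= 0 && callee < MAX_PROCS && depth < MAX_DEPTH);
    call_count[callee]++;
    if (depth > 0)
        call_edge[call_stack[depth - 1]][callee]++;  /* call relationship */
    if (loop_depth > 0)
        in_loop_calls[callee]++;                     /* loop structure information */
    call_stack[depth++] = callee;
}

/* Record return from the current procedure. */
void prof_leave(void) { assert(depth > 0); depth--; }
```

For simplicity the loop marker is global, so calls nested below a procedure called in a loop are also counted as in-loop calls; a fuller profiler would probably track markers per call frame.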

The first information storage portion 38 is constructed
with a storage medium, such as a semiconductor memory
including RAM or so forth, FD, HD and so forth, and stores
the dynamic information.

The optimizing portion 40 performs arrangement
optimization of all procedures on the basis of the dynamic
information stored in the first information storage portion
38, in order to avoid conflict on the cache memory between
procedures having high possibility of being executed
sequentially in time, and to avoid cache miss of frequently
used procedures. Also, the optimizing portion 40 generates
arrangement information for designating arrangement of the
procedures to the linker 36 and stores the arrangement
information thus generated in the second information
storage portion 39.
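
The notion of a conflict on the cache memory can be made concrete with a short C sketch of a direct-mapped cache. The line size of 16 bytes and the function names are assumptions made for illustration only; the four cache lines match the example used in this embodiment.

```c
#include <assert.h>

#define LINE_SIZE 16u  /* hypothetical cache line size in bytes */
#define NUM_LINES 4u   /* four cache lines, as in the embodiment's example */

/* Cache line index that a byte address maps to in a direct-mapped cache. */
unsigned line_of(unsigned addr) { return (addr / LINE_SIZE) % NUM_LINES; }

/* Bitmask of the cache lines occupied by code at [start, start + size). */
unsigned lines_occupied(unsigned start, unsigned size)
{
    const unsigned all = (1u << NUM_LINES) - 1;
    unsigned mask = 0, first, last, b;
    assert(size > 0);
    first = start / LINE_SIZE;
    last = (start + size - 1) / LINE_SIZE;
    for (b = first; b <= last && mask != all; b++)
        mask |= 1u << (b % NUM_LINES);
    return mask;
}

/* Nonzero when two procedures map to at least one common cache line. */
int conflicts(unsigned start_a, unsigned size_a,
              unsigned start_b, unsigned size_b)
{
    return (lines_occupied(start_a, size_a) &
            lines_occupied(start_b, size_b)) != 0;
}
```

Under these assumptions, two 16-byte procedures at addresses 0 and 64 share cache line 0 and evict each other when executed alternately, while procedures at addresses 0 and 32 do not.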

The second information storage portion 39 is constructed
with a storage medium, such as a semiconductor memory
including RAM or so forth, FD, HD and so forth, and stores
the arrangement information.

The first library storage portion 41 is constructed with a
storage medium, such as a semiconductor memory including
ROM, RAM or so forth, FD, HD, CD-ROM or so forth and
stores the rearrangeable libraries including the standard
library procedure, the run-time library procedure and the
user library procedure. Here, the rearrangeable library is a
rearrangeable object program. However, in order to
distinguish it from the rearrangeable object program
generated by the compiler 35, the rearrangeable object
program stored in the first library storage portion 41 is
referred to as the rearrangeable library.

The library generating portion 43 transforms the
rearrangeable library stored in the first library storage
portion 41 into the rearrangeable library arrangeable per
procedure to store in the second library storage portion 42.
The second library storage portion 42 is constructed with a
storage medium, such as a semiconductor memory including
RAM or so forth, FD, HD or so forth and stores the
rearrangeable library arrangeable per procedure.
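
One way to picture "arrangeable per procedure" is that every procedure is given its own distinctly named section instead of sharing a single text section. The C sketch below builds such a name from a procedure name and a source program name. The underscore separator, the function name make_section_name and the error convention are assumptions made for illustration, not details confirmed by the embodiment.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "<procedure>_<source>" into buf.
 * Returns 0 on success, -1 if the name does not fit or formatting fails. */
int make_section_name(char *buf, size_t size,
                      const char *proc, const char *src)
{
    int n;
    assert(buf != NULL && proc != NULL && src != NULL);
    n = snprintf(buf, size, "%s_%s", proc, src);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

With the procedures of FIG. 2 and a hypothetical source name "fig2", the calls would yield the section names "func_fig2", "func1_fig2" and "func2_fig2".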

Next, operation of the program transformation system
having the construction set forth above will be discussed
with reference to FIGS. 3 to 10.
At first, at step 301 shown in FIG. 3, the library generating
portion 43 transforms each rearrangeable library stored in
the first library storage portion 41 into rearrangeable library
arrangeable per procedure, recognizing the procedure unit, to
store in the second library storage portion 42.

A plurality of procedures as one group in one
rearrangeable library normally are included in a section,
such as text section (.text section), which is a unit of
arrangement. These procedures are aggregated together as
the text section upon linking in the linker 36. Each
individual procedure cannot be arranged independently per
the procedure. Therefore, by dividing each individual
section into sections per individual procedure, each
procedure can be appropriately arranged upon establishing
link by the linker 36.

The process for dividing the text section within the
rearrangeable library into the sections per the procedure will
be discussed hereinafter.

At first, since a global attribute indicating that the own
procedure is useful externally and symbol information
relating to attribute of the procedure and so forth are added
at the leading end of the procedure, these global attribute
and the symbol information are recognized as a leading label
for making reference to a leading address.

Next, on the basis of recognition of the leading label of
each procedure, a section having distinct name per each
procedure, such as "procedure name_source program
name" and so forth, is newly generated. Then, information
relating to the sections in the rearrangeable library is
concentrically and newly registered in a certain portion,
such as a section header portion, in the rearrangeable
library. If the text section is not necessary in the
rearrangeable library, the relevant information is deleted.

Since the rearrangeable library has an offset indicative of
the position of various kinds of information in the
rearrangeable library at respective portions, when the new
section is added as set forth above, error is inherently caused
in correspondence between the information and offset.
Therefore, offset has to be updated. Each rearrangeable
library processed as set forth above is stored in the second
program storage portion 32 as rearrangeable library
arrangeable per procedure.

At step 302, the compiler 35 compiles the source program
into the rearrangeable object program, and thereafter,
performs the process similar to the process of the library
generating portion 43 at step 301 for transforming the
rearrangeable library into the rearrangeable library
arrangeable per procedure to store in the second program
storage portion 32. At step 303, the linker 36 establishes a
link between the rearrangeable object program arrangeable
per procedure stored in the second program storage portion
32 and the rearrangeable library arrangeable per procedure
stored in the second library storage portion 42 to generate
executable temporary object program to store in the third
program storage portion 33.

At step 304, the profiler 37 collects dynamic information
consisting of call relationship between respective
procedures, call counts of respective procedures, loop
structure information and so forth while executing the
temporary object program read out from the third program
storage portion 33. The profiler 37 stores the dynamic
information thus obtained in the first information storage
portion 38.

At step 305, the optimizing portion 40 performs
arrangement optimization for all procedures on the basis of
the dynamic information stored in the first information
storage portion 38 to generate the arrangement information
to store in the second information storage portion 39. Detail
of the arrangement optimization of the procedure will be
discussed later.

At step 306, the linker 36 establishes a link between the
rearrangeable object program arrangeable per procedure and
the rearrangeable library arrangeable per procedure to
generate the executable final object program to store in the
fourth program storage portion 34. Thereafter, a sequence of
process is terminated.

Next, arrangement optimization process of the procedure
in the optimizing portion 40 will be discussed with reference
to FIGS. 4 to 10.

There are various kinds of arrangement optimization
methods of procedure for efficiently using the cache
memory. In the shown embodiment, arrangement method by
cache line coloring disclosed in A. H. Hashemi, et al.,
"Efficient Procedure Mapping Using Cache Line Coloring",
SIGPLAN, pp. 171-182, June, 1997, is employed.

At first, as a premise, each code of the object program
generated by the program transformation system and stored
in the primary storage device is read out from the primary
storage device and thereafter stored in the cache memory
consisting of four cache lines by the direct map method.

In the source program to be compiled, seven procedures
A to G are described in sequential order. When the source
program is compiled into the object program, the code sizes
of respective procedures A to G and the number of cache
lines to be occupied (cache line number) among the cache
lines forming the cache memory are as shown in FIG. 5.

On the other hand, as result of dynamic analysis in the
profiler 37, it is assumed that frequency of call from the
procedure A to the procedure B is "90", frequency of call
from the procedure B to the procedure C is "80", frequency
of call from the procedure C to the procedure D is "70",
frequency of call from the procedure A to the procedure E is
"40", frequency of call from the procedure E to the
procedure C is "100", frequency of call from the procedure E
to the procedure F is "0", and frequency of call from the
procedure F to the procedure G is "0".

The arrangement method by cache line coloring reduces
conflict on the cache memory in one generation (relationship
of direct call from one procedure to the other procedure)
using a procedure call graph which will be discussed later.

In the arrangement method, "color" is assigned for each
cache line, and arrangement of the procedure is performed
using number of "colors" required for arrangement, namely
cache line number, "colors" on which the procedures are
arranged, and non-useable groups.

In the shown embodiment, red (r) is assigned for the first
cache line, green (g) is assigned for the second cache line,
blue (b) is assigned for the third cache line, and yellow (y)
is assigned for the fourth cache line, respectively. The
non-useable group of a procedure is the aggregated group of
the "colors" occupied by the already arranged procedures
which are in a relationship to directly call or to be directly
called by that procedure.

At first, at step 401 shown in FIG. 4, the optimizing
portion 40 generates the procedure call graph as shown in
FIG. 6 on the basis of the dynamic information stored in the
first information storage portion 38. In FIG. 6, nodes A to G
represent procedures. Lines between the nodes represent call
relationship of the procedures. Numerical values added for
the lines represent call frequency of the call from start point
nodes, namely the procedures at the roots of arrows, to end
point nodes, namely the procedures at the tips of the arrows.

At step 402, concerning the procedure call graph, lines
and nodes are divided into a group having high call
frequency and a group having low call frequency. In the
shown embodiment, as can be appreciated from FIG. 6, the
group having high call frequency consists of nodes A to E,
the line from the node A to the node B, the line from the node
A to the node E, the line from the node B to the node C, the
line from the node C to the node D and the line from the
node E to the node C. On the other hand, the group having
low call frequency consists of the nodes F and G, the line
from the node E to the node F and the line from the node F
to the node G.
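
The division at step 402 can be sketched in C using the call frequencies assumed in this example. The text does not state how the boundary between high and low call frequency is chosen, so the threshold parameter below is an assumption; with a threshold of 1, the sketch reproduces the grouping described above. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { A, B, C, D, E, F, G, NUM_PROCS };

struct edge { int caller, callee; unsigned freq; };

/* Call frequencies assumed in the embodiment's example. */
static const struct edge call_graph[] = {
    {A, B, 90}, {B, C, 80}, {C, D, 70}, {A, E, 40},
    {E, C, 100}, {E, F, 0}, {F, G, 0},
};
#define NUM_EDGES (sizeof call_graph / sizeof call_graph[0])

/* Mark a node as "high" when any line touching it has freq >= threshold.
 * Returns the number of lines in the high call frequency group. */
size_t split_by_frequency(const struct edge *edges, size_t n,
                          unsigned threshold, int is_high_node[NUM_PROCS])
{
    size_t i, high_edges = 0;
    for (i = 0; i < (size_t)NUM_PROCS; i++)
        is_high_node[i] = 0;
    for (i = 0; i < n; i++) {
        assert(edges[i].caller >= 0 && edges[i].caller < NUM_PROCS);
        assert(edges[i].callee >= 0 && edges[i].callee < NUM_PROCS);
        if (edges[i].freq >= threshold) {
            is_high_node[edges[i].caller] = 1;
            is_high_node[edges[i].callee] = 1;
            high_edges++;
        }
    }
    return high_edges;
}
```

Calling split_by_frequency(call_graph, NUM_EDGES, 1, flags) places nodes A to E and five lines in the high group, leaving nodes F and G and the two zero-frequency lines in the low group.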

At step 403, the lines and the nodes are rearranged within
each of the divided groups. Namely, in the group having high
call frequency, the lines are rearranged in descending
order of the numerical values added to the lines. In
contrast to this, in the group having low call frequency,
the nodes are rearranged in descending order of the
cache line number of the procedure, and are arranged mainly
for filling up voids in the program space.

In the shown embodiment, as can be appreciated from
FIG. 6, in the group having high call frequency, the lines are
arranged in the sequential order of the line from the node E
to the node C, the line from the node A to the node B, the
line from the node B to the node C, the line from the node
C to the node D and the line from the node A to the node E.
On the other hand, in the group having low call frequency,
as can be appreciated from FIG. 5, since the cache line
number of the procedure G is 2 and the cache line number
of the procedure F is 1, the nodes are arranged in sequential
order of the node G and then the node F.
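
The two orderings of step 403 can be sketched with the standard C function qsort. The struct layouts and function names are hypothetical; only the sort keys (call frequency for lines, cache line number for nodes) come from the description above.

```c
#include <assert.h>
#include <stdlib.h>

struct edge { char caller, callee; unsigned freq; };
struct node { char name; unsigned cache_lines; };

/* qsort comparator: larger call frequency first. */
int by_freq_desc(const void *p, const void *q)
{
    const struct edge *a = p, *b = q;
    return (a->freq < b->freq) - (a->freq > b->freq);
}

/* qsort comparator: larger cache line number first. */
int by_lines_desc(const void *p, const void *q)
{
    const struct node *a = p, *b = q;
    return (a->cache_lines < b->cache_lines) -
           (a->cache_lines > b->cache_lines);
}

/* Order the lines of the high call frequency group. */
void sort_high_edges(struct edge *e, size_t n)
{
    assert(e != NULL);
    qsort(e, n, sizeof *e, by_freq_desc);
}

/* Order the nodes of the low call frequency group. */
void sort_low_nodes(struct node *v, size_t n)
{
    assert(v != NULL);
    qsort(v, n, sizeof *v, by_lines_desc);
}
```

Note that qsort is not guaranteed to be stable, so lines with equal call frequency may come out in any relative order; the example values here are all distinct.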

At step 404, judgment is made whether any line of the
group of high call frequency is left or not. When the result
of judgment is positive ("YES"), the process is advanced to
step 405. At this time, since the process is executed for the
first time, all lines are left. Therefore, the result of judgment
becomes "YES".

At step 405, check is performed whether the nodes at both
ends of the line at the highest order in the rearranged order
among remaining lines are both not yet arranged. If the
result is YES, the process is advanced to step 406. In the
shown case, the line of the highest order among the
remaining lines is the line from the node E to the node C,
and since the process is executed for the first time, the
nodes E and C at both ends are not yet arranged.
Accordingly, the result of judgment becomes "YES".

At step 406, after arranging the nodes at both ends of the
objective line adjacent to each other, the process is
advanced to step 407. In this case, the nodes at both ends of
the objective line can be arranged at arbitrary position in the
program space. Here, the cache line numbers of the
procedures E and C are both 2, as can be appreciated from
FIG. 5. As shown in the first row of FIG. 7, portions E1 and
E2 of the procedure E are arranged on the first and second
cache lines (colors are red (r) and green (g)). Portions C1 and
C2 of the procedure C are arranged on the third and fourth
cache lines (colors are blue (b) and yellow (y)). In this case,
the nodes E and C are considered to be merged to form a
single node. Such single node will be referred to as
composite node E-C.

At step 407, after updating the groups which cannot be
used (the non-useable groups), the process is returned to
step 404.

In case of the node E, the "colors" of the cache lines on
which the node C, directly called by the node E, is arranged
are blue (b) and yellow (y), so the non-useable group
becomes E{b, y}. Similarly, in case of the node C, the
"colors" of the cache lines on which the node E, which
directly calls the node C, is arranged are red (r) and
green (g). Therefore, the non-useable group becomes
C{r, g}.
The processes of the steps 404 to 407 set forth above are
repeated until no line having both end nodes not yet
arranged is left among the lines in the group having high
call frequency. If no line is left in the group having high
call frequency, the result of judgment at step 404 becomes
"NO". Then, process is advanced to step 416. In this case,
since the line from the node A to the node B is left and the
nodes A and B at both ends of the line are not yet arranged,
the processes at steps 406 and 407 are performed.

As can be appreciated from FIG. 5, both of the procedures
A and B have cache line number 1. Therefore, as shown in
the first row of FIG. 7, the procedure A is arranged on the
third cache line (color is blue (b)), and the procedure B is
arranged on the fourth cache line (color is yellow (y)). Then,
the nodes A and B become a composite node A-B. Next, in
case of the node A, since the "color" of the cache line on
which the node B, directly called by the node A, is arranged
is yellow (y), the non-useable group becomes A{y}.
Similarly, in case of the node B, since the "color" of the
cache line on which the node A, which directly calls the
node B, is arranged is blue (b), the non-useable group
becomes B{b}.
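
Steps 406 and 407 for a pair of not-yet-arranged nodes can be sketched in C with the "colors" held as bit flags. The struct, the function names and the bitmask representation are assumptions for illustration; the color assignments and the resulting non-useable groups follow the E-C and A-B cases described above.

```c
#include <assert.h>

enum { RED = 1u << 0, GREEN = 1u << 1, BLUE = 1u << 2, YELLOW = 1u << 3 };
#define NUM_LINES 4u

struct placed {
    unsigned colors;       /* cache lines the procedure occupies */
    unsigned unavailable;  /* non-useable group: colors of direct callers/callees */
};

/* Colors covered by `count` consecutive cache lines starting at `first`. */
unsigned color_span(unsigned first, unsigned count)
{
    unsigned mask = 0, i;
    assert(count >= 1 && count <= NUM_LINES);
    for (i = 0; i < count; i++)
        mask |= 1u << ((first + i) % NUM_LINES);
    return mask;
}

/* Step 406: place caller and callee next to each other from line `first`.
 * Step 407: add each one's colors to the other's non-useable group. */
void place_adjacent(struct placed *caller, unsigned caller_lines,
                    struct placed *callee, unsigned callee_lines,
                    unsigned first)
{
    caller->colors = color_span(first, caller_lines);
    callee->colors = color_span(first + caller_lines, callee_lines);
    caller->unavailable |= callee->colors;
    callee->unavailable |= caller->colors;
}
```

Placing E and C (two lines each) from the first cache line gives E{b, y} and C{r, g}; placing A and B (one line each) from the third cache line gives A{y} and B{b}, matching the text.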

It should be noted that, in the procedure call graph shown
in FIG. 6, although the line from the node A to the node E
exists, the colors red (r) and green (g) of the cache lines
on which the node E is arranged are not included in the
non-useable group of the node A. This is because the line
from the node A to the node E has not yet been processed
owing to its low order of call frequency. In the current
status, conflict can be caused in connection with the line
from the node A to the node E, but process is performed
according to the original order of the lines. Thus, the
current status is acceptable.

On the other hand, when a line in the group having high call
frequency is left but a node on either side of the remaining line has
already been arranged, the result of judgment at step 405 becomes
"NO", and the process is advanced to step 408.

At step 408, check is performed whether the line to be processed is a
line connecting nodes in two different composite nodes. If the result
of checking at step 408 is "YES", the process is advanced to step
409. In the current condition, since the line directed from the node
B to the node C, which has the highest order among the remaining
lines, connects the composite node E-C and the composite node A-B,
the result of judgment at step 408 becomes "YES". Therefore, the
process is advanced to step 409.

At step 409, the two composite nodes connected by the line to be
processed are merged into a single composite node. This is done by
coupling the composite node having the smaller number of merged nodes
(hereinafter referred to as "shorter composite node") to the
composite node having the greater number of merged nodes (hereinafter
referred to as "longer composite node"). When the shorter composite
node is coupled to the longer composite node, the two are also
coupled in the program space.

At first, it is determined on which side of the longer composite node
the shorter composite node is to be arranged. Specifically, for the
node that constitutes the line to be processed and belongs to the
longer composite node, it is judged toward which of the left and
right boundaries of the longer composite node the center position of
that node is inclined, on the basis of the number of cache lines
required for reaching the left and right boundaries. Then, the
shorter composite node is arranged on the side toward which the
center position is inclined.

Next, the orientation of the shorter composite node is determined and
the shorter composite node is arranged. Specifically, the orientation
is determined so that, of the nodes constituting the line to be
processed, the node that does not belong to the longer composite node
is arranged as close as possible to the already arranged node in the
longer composite node. If conflict is caused by this arrangement of
the shorter composite node, the nodes not belonging to the longer
composite node are shifted away from the longer composite node until
the conflict is resolved. However, when the conflict cannot be
avoided at any arrangement position, those nodes are returned to
their initial arrangement positions. Then, the process is advanced to
step 410.

In the shown embodiment, both of the composite nodes E-C and A-B have
two merged nodes. Therefore, either of the composite nodes can be
taken as the shorter composite node. In the shown case, the composite
node A-B is taken as the shorter composite node.

Among the nodes E and C constituting the longer composite node E-C,
the center position of the node C, which forms the line directed from
the node B to the node C to be processed, is located between the
portions C1 and C2 as shown in the first row of FIG. 7. Therefore,
the number of cache lines required for reaching the left side
boundary of the longer composite node E-C is three, whereas the
number of cache lines required for reaching the right side boundary
of the longer composite node E-C is one. Accordingly, the shorter
composite node A-B is arranged on the right side of the longer
composite node E-C.
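
The side determination can be sketched as below. This is a minimal sketch under assumptions: the names `boundary_distances` and `choose_side` are hypothetical, a composite node is modeled as a left-to-right list of (node, number of cache lines) pairs, and the node E is taken to occupy two cache lines because it covers the colors red and green in the walk-through.

```python
def boundary_distances(layout, node):
    """Distances, in cache lines, from the center of `node` to both boundaries.

    layout: list of (name, cache_line_count) pairs in left-to-right order.
    """
    total = sum(count for _, count in layout)
    start = 0
    for name, count in layout:
        if name == node:
            center = start + count / 2
            return center, total - center
        start += count
    raise KeyError(node)


def choose_side(layout, node):
    """Side of the composite node toward which the center of `node` is inclined."""
    left, right = boundary_distances(layout, node)
    if left == right:
        return "either"
    return "right" if right < left else "left"


# Composite node E-C: the center of C lies between C1 and C2.
print(boundary_distances([("E", 2), ("C", 2)], "C"))  # (3.0, 1.0)
print(choose_side([("E", 2), ("C", 2)], "C"))         # right
```

With the layout E-C the distances are three and one, so the shorter composite node goes to the right side, in agreement with the text.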

Next, the orientation of the shorter composite node A-B is determined
so that, of the nodes B and C forming the line directed from the node
B to the node C to be processed, the node B, which does not belong to
the longer composite node E-C, is located as close as possible to the
node C which has already been arranged. Thus, the composite node
becomes B-A. Since no conflict is caused in the arrangement set forth
above, the arrangement is maintained as is (see second row of
FIG. 7). By this, a new composite node E-C-B-A is generated.
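
The choice of orientation at step 409 can be illustrated with the following sketch. It works on node names only and deliberately ignores cache-line sizes and the conflict-driven shifting. The function name `merge_composites` and its parameters are assumptions introduced for illustration.

```python
def merge_composites(longer, shorter, endpoint, side):
    """Couple `shorter` to `longer` so that `endpoint` ends up nearest the joint.

    longer, shorter: lists of node names in left-to-right order.
    endpoint: the node of `shorter` that is on the line being processed.
    side: "left" or "right", the side of `longer` chosen beforehand.
    """
    candidates = [list(shorter), list(reversed(shorter))]
    if side == "right":
        # The joint is at the left end of the appended part.
        best = min(candidates, key=lambda c: c.index(endpoint))
        return list(longer) + best
    # The joint is at the right end of the prepended part.
    best = min(candidates, key=lambda c: len(c) - 1 - c.index(endpoint))
    return best + list(longer)


# The line B->C couples A-B to the right side of E-C with B next to C.
print(merge_composites(["E", "C"], ["A", "B"], "B", "right"))
```

For the example of the text, the shorter composite node A-B is reversed to B-A and the result is E-C-B-A.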

At step 410, check is performed whether a vacant region 
is formed in the program space through the foregoing 
arrangement process or not. If the result of checking at step 
410 is "NO", the process is advanced to step 407. In the 
shown case, since no vacant region is formed, the process is 
advanced to step 407. Then, after updating the non-useable 
group, the process is returned to step 404. 

In the case of the node A, since the "color" of the cache line on
which the node B in a relationship of direct call is arranged is red
(r), the non-useable group becomes A{r}. Similarly, in the case of
the node B, since the "color" of the cache line on which the node A
in a relationship of direct call is arranged is green (g), and since
the "colors" of the cache lines on which the node C in a relationship
of direct call is arranged are blue (b) and yellow (y), the
non-useable group becomes B{g, b, y} (see second row of FIG. 7).

On the other hand, when the result of checking at step 410 is "YES",
namely when a vacant region is formed in the program space through
the foregoing arrangement process, the process is advanced to step
411.

At step 411, the node having high order in the group 
having low call frequency is arranged in the vacant region. 
Thereafter, process is advanced to step 407. 

The processes at steps 404, 405, 408 to 411 and 407 as set forth
above are repeated until no line is found, among the lines of the
group having high call frequency, that connects nodes in two mutually
different composite nodes and has an already arranged node on either
side. Then, if no line of the group having high call frequency is
left, the result of checking at step 404 becomes "NO", and the
process is advanced to step 416.

If there is no line, among the lines of the group having high call
frequency, that connects nodes in two mutually different composite
nodes and has an already arranged node on either side, the result of
checking at step 408 becomes "NO". Then, the process is advanced to
step 412.

Through the process set forth above, the line from the node E to the
node C, the line from the node A to the node B and the line from the
node B to the node C are processed. Therefore, in the group having
high call frequency, the line from the node C to the node D and the
line from the node A to the node E are left. However, neither of
these lines connects nodes in two different composite nodes.
Therefore, the result of checking at step 408 becomes "NO". Then, the
process is advanced to step 412.

At step 412, check is performed whether one of the two nodes
constituting the line to be processed belongs to a composite node
while the other node has not yet been arranged. If the result of
checking at step 412 is "YES", the process is advanced to step 413.
In the present case, the line from the node C to the node D, which
has the highest order among the remaining lines, has the node C
belonging to the composite node E-C-B-A and the node D which is not
yet arranged. Therefore, the result of checking at step 412 becomes
"YES". Then, the process is advanced to step 413.
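
The case analysis performed at steps 405, 408 and 412 can be summarized by the following sketch. The function name `next_step` and the dictionary `composite_of` are assumptions made for illustration. The returned numbers are the step numbers used in this description, and a line that matches none of the three cases falls through to step 414.

```python
def next_step(line, composite_of):
    """Return the step that handles `line` after the checks at 405, 408 and 412.

    line: (caller, callee) pair.
    composite_of: dict mapping each arranged node to an identifier of its
        composite node; nodes that are not yet arranged are absent.
    """
    a, b = line
    ca, cb = composite_of.get(a), composite_of.get(b)
    if ca is None and cb is None:
        return 406  # step 405 "YES": neither node is arranged yet
    if ca is not None and cb is not None and ca != cb:
        return 409  # step 408 "YES": two different composite nodes
    if (ca is None) != (cb is None):
        return 413  # step 412 "YES": one node arranged, the other not
    return 414      # both nodes already belong to the same composite node


# State after E-C-B-A has been formed and D is still unarranged.
state = {"E": 0, "C": 0, "B": 0, "A": 0}
print(next_step(("C", "D"), state))
```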

At step 413, the non-arranged node of the line to be processed is
coupled with the composite node. When the non-arranged node is
coupled to the composite node, it is also coupled with the composite
node in the program space.

At first, it is determined on which side of the composite node the
non-arranged node is to be arranged. Specifically, for the node that
constitutes the line to be processed and belongs to the composite
node, it is judged toward which of the left and right boundaries of
the composite node the center position of that node is inclined, on
the basis of the number of cache lines required for reaching the left
and right boundaries. Then, the non-arranged node is arranged on the
side toward which the center position is inclined.

In this case, if conflict is caused by arrangement of the
non-arranged node, the non-arranged node is shifted away from the
nodes constituting the composite node until the conflict is resolved.
However, when the conflict cannot be avoided at any arrangement
position, the non-arranged node is returned to the initial
arrangement position. Then, the process is advanced to step 410.
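
The shifting rule can be sketched as follows. The sketch assumes a cache of four lines colored r, g, b and y in this order, with the composite node E-C-B-A starting at position 0. The names `colors_at` and `place_left_of` are hypothetical, and positions are counted in cache lines.

```python
COLORS = ["r", "g", "b", "y"]  # assumption: four cache lines, one color each


def colors_at(position, size):
    """Colors covered by a node of `size` cache lines starting at `position`."""
    return {COLORS[(position + i) % len(COLORS)] for i in range(size)}


def place_left_of(boundary, size, avoid):
    """Place a node to the left of `boundary`, shifting away until no conflict.

    Returns (position, gap), where gap is the number of vacant cache lines
    left between the node and the boundary. If no gap resolves the conflict,
    the initial position next to the boundary is returned.
    """
    for gap in range(len(COLORS)):
        position = boundary - size - gap
        if not (colors_at(position, size) & avoid):
            return position, gap
    return boundary - size, 0


# Node D (two cache lines) must avoid the colors b and y of the node C.
print(place_left_of(0, 2, {"b", "y"}))  # (-4, 2)
```

Under these assumptions the node D ends up two cache lines away from the left boundary, leaving a vacant region of two cache lines.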

Among the nodes E, C, B and A constituting the composite node
E-C-B-A, the center position of the node C, which forms the line from
the node C to the node D to be processed, is located between the
portions C1 and C2 as shown in the first row of FIG. 7. Therefore,
the number of cache lines required for reaching the left side
boundary of the composite node E-C-B-A is three, whereas the number
of cache lines required for reaching the right side boundary of the
composite node E-C-B-A is three. Accordingly, the non-arranged node
can be arranged on either the left or the right side of the composite
node E-C-B-A. In the shown case, the node D is determined to be
arranged on the left side of the composite node E-C-B-A.

In this case, when the portions D1 and D2 of the node D are arranged
on the immediate left side of a portion E1 of the node E, conflict
can be caused between the portions D1 and D2 of the node D and the
portions C1 and C2 of the node C. Therefore, in order to avoid
conflict, the portions D1 and D2 of the node D are arranged at a
distance of two cache lines from the portion E1 of the node E (see
third row of FIG. 7).

Next, in such case, a vacant region corresponding to two cache lines
is formed on the right side of the portions D1 and D2 of the node D.
Then, the result of checking at step 410 becomes "YES". Thus, the
process is advanced to step 411.

At step 411, in the vacant region for two cache lines on the right
side of the portions D1 and D2 of the node D, the node having the
highest order in the group having low call frequency is arranged. In
the shown case, the node G is arranged in the vacant region set forth
above (see fourth row in FIG. 7). Then, the process is advanced to
step 407 and, after the non-useable group is updated, returned to
step 404. In the case of the node D, since the "colors" of the cache
lines on which the node C in a relationship of direct call is
arranged are blue (b) and yellow (y), the non-useable group becomes
D{b, y} (see third row of FIG. 7).

The processes at steps 404, 405, 408, 412, 413, 410, 411 and 407 as
set forth above are repeated until no line is found, among the
remaining lines of the group having high call frequency, that does
not connect nodes in two mutually different composite nodes but has
one node belonging to a composite node and the other node not yet
arranged. Then, if no line of the group having high call frequency is
left, the result of checking at step 404 becomes "NO". Then, the
process is advanced to step 416.

If no such line is found among the remaining lines of the group
having high call frequency, the result of checking at step 412
becomes "NO". Then, the process is advanced to step 414.

Through the process set forth above, the line from the node E to the
node C, the line from the node A to the node B, the line from the
node B to the node C and the line from the node C to the node D are
processed. Therefore, in the group having high call frequency, only
the line from the node A to the node E is left. However, this line is
not a line that has one node belonging to the composite node and the
other node not yet arranged. Therefore, the result of checking at
step 412 becomes "NO". Then, the process is advanced to step 414.

At step 414, check is performed whether both of the two nodes
constituting the line to be processed belong to the same composite
node. If the result of checking at step 414 is "YES", the process is
advanced to step 415. In the present case, the line from the node A
to the node E, which has the highest order among the remaining lines,
has both the nodes A and E belonging to the composite node E-C-B-A.
Therefore, the result of checking at step 414 becomes "YES". Then,
the process is advanced to step 415.

At step 415, conflict between the nodes constituting the line to be
processed is eliminated. Namely, if conflict is caused between the
nodes constituting the line to be processed, the node located closer
to the boundary of the composite node is shifted beyond the boundary
until the conflict is resolved. However, when the conflict cannot be
avoided, the shifted node is returned to its initial arrangement
position. Then, the process is advanced to step 410.

In the shown embodiment, the line to be processed is the line from
the node A to the node E. As can be appreciated from the fourth row
of FIG. 7, conflict is caused between the nodes A and E. Among the
nodes A and E constituting the line from the node A to the node E,
the node A is closer to the boundary of the composite node E-C-B-A.
Therefore, the node A is shifted beyond the boundary. In the shown
case, since conflict can be avoided by shifting the node A by one
cache line, the node A is arranged at the position shifted by one
cache line (see the fifth row of FIG. 7).

Next, in the shown case, a vacant region of one cache line is formed
on the right side of the node A. Therefore, the result of judgment at
step 410 becomes "YES". Then, the process is advanced to step 411.

At step 411, in the vacant region for one cache line on the right
side of the node A, the node remaining in the group having low call
frequency is arranged. In the present case, after the node F is
arranged (see the sixth row of FIG. 7), the process is advanced to
step 407. After the non-useable group is updated, the process is
returned to step 404. In the case of the node A, since the "colors"
of the cache lines on which the nodes E and B in a relationship to be
directly called by the node A are arranged are red (r) and green (g),
the non-useable group becomes A{r, g} (see fifth row of FIG. 7).
Similarly, in the case of the node B, the "colors" of the cache lines
on which the node C and the node A, both in a relationship of direct
call with the node B, are arranged are blue (b) and yellow (y). Thus,
the non-useable group becomes B{b, y} (see the fifth row of FIG. 7).

The process of steps 404, 405, 408, 412, 414, 415, 410, 411 and 407
set forth above is repeated until no line connecting nodes in the
same composite node is left. If no line in the group having high call
frequency is left, the result of judgment at step 404 becomes "NO".
Then, the process is advanced to step 416.

At step 416, the remaining nodes in the group having low call
frequency are arranged by simple depth-first search. When a plurality
of composite nodes are arranged away from each other through the
process set forth above, preference is given to each composite node
on the basis of the call frequency to determine the final
arrangement. Then, the sequence of processes ends.

One example of the arrangement information obtained through the
arrangement optimizing process of the procedures set forth above is
illustrated in FIG. 8.

As a premise, it is assumed that the procedures A and B are included
in a source program file of a file name "test1.o", the functions E, F
and G are included in a source program file of a file name "test2.o",
and the procedures C and D are standard library procedures included
in a library file of a file name "libc.a". Also, the size of one
cache line is assumed to be 32 bytes (0x20).
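
The relation between an address and a cache line under the 32-byte assumption can be sketched as follows. The number of cache lines is not stated at this point of the description, so four lines (one per color used in the walk-through) and a direct-mapped cache are assumed here. The names `cache_line_index` and `lines_needed` are hypothetical.

```python
CACHE_LINE_SIZE = 0x20  # 32 bytes, as assumed in the description
NUM_CACHE_LINES = 4     # assumption: one line per color r, g, b, y


def cache_line_index(address):
    """Index of the cache line onto which `address` is mapped."""
    return (address // CACHE_LINE_SIZE) % NUM_CACHE_LINES


def lines_needed(size_in_bytes):
    """Number of cache lines occupied by a procedure of the given size."""
    return (size_in_bytes + CACHE_LINE_SIZE - 1) // CACHE_LINE_SIZE


# Two procedures whose start addresses differ by a multiple of
# CACHE_LINE_SIZE * NUM_CACHE_LINES map onto the same cache line.
print(cache_line_index(0x1000), cache_line_index(0x1080))
```

Two procedures can conflict only when their address ranges map onto a common cache line index, which is why arrangement is handled in units of one cache line.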

In FIG. 8, "GROUP1" is a segment name which is given in the case
where the output section is handled as one group. "!LOAD" represents
a segment type. This field is fixed. In the shown case, "LOAD"
represents a segment to be loaded in the memory. "?RX" represents a
segment attribute which shows the read/write/execute attribute of the
segment. In the case of an instruction portion (text code), it is
fixed at "?RX". "A0x1000" represents the alignment condition applied
upon arrangement of the segment in the memory space. In the shown
case, there is illustrated the case where the alignment condition is
"0x1000".

On the other hand, "_D_LIB", "_G_test2" and so forth are output
section names which represent groups formed by coupling input
sections of the same type and attribute. "$PROGBITS" represents the
type of the input section. In the case of the text code, the type of
the input section is fixed to "$PROGBITS". "?AX" represents a section
attribute which represents the attribute of occupy/writeable/
executable or so forth of the memory. In the case of the text code,
it is fixed at "?AX".

"A0x20" represents an alignment condition upon arranging the input
section in the output section. Since consideration is given to the
arrangement per cache line, the alignment condition is 0x20, the size
of one cache line.

"_D_test1", "_G_test2" and so forth represent names of the input
sections to be arranged in the output section. "libc.a", "test2.o"
and so forth represent names of files included in the input section.
When the output section is formed by aggregating the same input
sections of a plurality of files, it is possible to describe a
plurality of file names.

As set forth above, by providing an input section name per procedure,
the order of arrangement of the procedures can be designated together
with the arrangement condition. With the construction of the shown
embodiment, the library generating portion 43 for transforming the
rearrangeable library into the rearrangeable library arrangeable per
procedure is provided. Dynamic information is collected with respect
to all procedures upon dynamic parsing by the profiler 37, the
arrangement information determining the optimal arrangement for all
of the procedures is generated on the basis of the dynamic
information, and all procedures are arranged on the basis of the
arrangement information. Therefore, conflict on the cache memory
between all procedures constituting the object program can be
reduced. In conjunction therewith, cache misses of frequently used
procedures can be reduced. By this, execution speed can be increased
upon execution of the object program by the computer or CPU.

In this respect, assume that seven procedures are included in the
source program to be compiled and are described in sequential order,
and that the numbers of cache lines of the respective procedures A to
G when the source program is compiled into the object program are as
shown in FIG. 5. If the arrangement optimization process of the
procedures by the optimizing portion 40 is not provided at all, the
procedures A to G in the source program are compiled into the object
program in the described order. Therefore, conflict is caused between
the procedures C and E as shown in FIG. 9.

On the other hand, in the present invention, since all procedures are
handled equally without distinguishing the kinds of the procedures
and arrangement is possible per procedure, the possibility of
completely eliminating conflict can be high. In the procedure call
graph shown in FIG. 6, as shown in FIG. 10, the procedures C and D
are standard library procedures. If, as in the prior art, these
procedures C and D are excluded from the arrangement optimization
process of the procedures, it is not possible to completely avoid
conflict, even when the arrangement method by cache line coloring as
disclosed in the above-identified publication is used, for the reason
set out below. When call instructions for a plurality of standard
library procedures are described in the source program, the
corresponding standard library procedures are read out from the
library storage portion upon establishing a link in the linker and
are concentrically arranged in a special region of the primary
storage device in the prior art. It has not been possible to
designate the arrangement per procedure.

FIG. 11 shows the arrangement optimization process of the procedures
by the optimizing portion in the case where the standard library
procedures C and D are excluded from the arrangement optimization
process. In this case, since the standard library procedures C and D
are excluded from the arrangement optimization process, the line from
the procedure E to the procedure C and the line from the procedure B
to the procedure C are naturally excluded from the process.
Accordingly, as shown in the fifth row of FIG. 11, conflict of the
procedure E and the procedure C cannot be avoided.

(Second Embodiment)

Next, discussion will be given for the second embodiment of the
program transformation system according to the present invention.

FIG. 12 is a block diagram showing a construction of the second
embodiment of the program transformation system according to the
present invention. In FIG. 12, like elements to those illustrated in
FIG. 1 will be identified by like reference numerals, and detailed
discussion for such common elements will be omitted in order to avoid
redundant disclosure and to keep the disclosure simple enough to
facilitate clear understanding of the present invention.

In the embodiment of the program transformation system shown in
FIG. 12, the rearrangeable library stored in the first library
storage portion 41 is also supplied to the linker 36 in addition to
the library generating portion 43. The following is the reason why
such construction is provided.

Namely, when the number of the rearrangeable libraries is large, it
takes a long period to transform all of the rearrangeable libraries
stored in the first library storage portion 41 into the rearrangeable
libraries arrangeable per procedure.

Therefore, concerning a part of the rearrangeable libraries, without
transforming the rearrangeable libraries into the rearrangeable
libraries arrangeable per procedure by the library generating portion
43, a direct link to the rearrangeable libraries arrangeable per
procedure is established by the linker 36.

In this case, the rearrangeable library to be directly supplied to
the linker 36 may be judged by the linker 36 on the basis of the
dynamic information stored in the first information storage portion
38 or the code size of each procedure, for example.

With the construction set forth above, the final object program can
be generated in a shorter period than that in the first embodiment.

(Third Embodiment)

Next, discussion will be given for the third embodiment of the
program transformation system according to the present invention.

FIG. 13 is a block diagram showing a construction of the third
embodiment of the program transformation system according to the
present invention. In FIG. 13, like elements to those illustrated in
FIG. 12 will be identified by like reference numerals, and detailed
discussion for such common elements will be omitted in order to avoid
redundant disclosure and to keep the disclosure simple enough to
facilitate clear understanding of the present invention.

In the third embodiment of the program transformation system, in
place of the compiler 35 and the profiler 37 shown in FIG. 12, a
compiler 44 and a profiler 45 are newly provided.

Different from the profiler 37 shown in FIG. 12, the profiler 45 has
only a function of simply reading the executable temporary object
program stored in the third program storage portion 33 and executing
the read out temporary object program. On the other hand, upon
compiling the source program into the executable temporary object
program, the compiler 44 inserts into the temporary object program a
count code which counts the number of times each procedure is
actually executed when the temporary object program is executed by
the profiler 45. By this, the profiler 45 can collect dynamic
information consisting of the call relationship between the
procedures, call counts of the respective procedures and so forth by
executing the temporary object program, and stores the obtained
dynamic information in the first information storage portion 38.

Next, discussion will be given for operation of the third embodiment
of the program transformation system constructed as set forth above
with reference to FIG. 14.

At first, at step 1401 shown in FIG. 14, the compiler 44 compiles the
source program (see FIG. 2) read out from the first program storage
portion 31 into the executable temporary object program while
inserting the count code, and stores it in the second program storage
portion 32.

At step 1402, the linker 36 establishes a link between the temporary
object program with the inserted count code stored in the second
program storage portion 32 and the rearrangeable library stored in
the first library storage portion 41 to generate an executable
temporary object program, and stores it in the third program storage
portion 33.

At step 1403, the profiler 45 executes the temporary object program
read out from the third program storage portion 33. In this case,
since the count code is inserted in the temporary object program,
dynamic information consisting of the call relationship between the
procedures, the call count of each procedure and so forth is
collected. The obtained dynamic information is stored in the first
information storage portion 38.

At step 1404, the optimizing portion 40 generates the arrangement
information by performing arrangement optimization for all procedures
on the basis of the dynamic information stored in the first
information storage portion 38, and stores it in the second
information storage portion 39. The process of step 1404 is
substantially similar to the process of step 305 in the first
embodiment. Thus, detailed discussion for this step will be omitted
in order to avoid redundant disclosure.

At step 1405, the library generating portion 43 transforms each
rearrangeable library stored in the first library storage portion 41
into the rearrangeable library arrangeable per procedure by
recognizing each procedure, and stores it in the second library
storage portion 42. The process of step 1405 is substantially similar
to the process at step 301 in the first embodiment. Thus, detailed
discussion for this step will be omitted in order to avoid redundant
disclosure.

At step 1406, the compiler 35 compiles the source program into the
rearrangeable object program and thereafter transforms it into the
rearrangeable object program arrangeable per procedure in a process
similar to the process of the library generating portion 43 at step
1405, and stores it in the second program storage portion 32.

At step 1407, the linker 36 establishes a link between the
rearrangeable object program arrangeable per procedure and the
rearrangeable library arrangeable per procedure on the basis of the
arrangement information stored in the second information storage
portion 39 to generate the executable final object program, and
stores it in the fourth program storage portion 34. Thereafter, the
sequence of processes ends.

With the construction of the shown embodiment set forth above, even
if the profiler 45 does not have a function for collecting dynamic
information, substantially the same effect as that of the first
embodiment can be obtained.

Although the present invention has been illustrated and described
with respect to exemplary embodiment thereof, it should be understood
by those skilled in the art that the foregoing and various other
changes, omissions and additions may be made therein and thereto,
without departing from the spirit and scope of the present invention.
Therefore, the present invention should not be understood as limited
to the specific embodiment set out above but to include all possible
embodiments which can be embodied within a scope encompassed and
equivalents thereof with respect to the feature set out in the
appended claims.

For example, while the present invention has been illustrated and
discussed in terms of an example applied to generating one final
object program from one source program in the foregoing embodiments,
the present invention should not be limited to the disclosed
embodiments. Namely, the present invention is, as a matter of course,
applicable to the case where a plurality of source programs are
compiled into respective rearrangeable object programs and the thus
generated rearrangeable object programs are linked by the linker 36
to generate one final object program.

On the other hand, in the respective embodiments set forth above, the
respective program storage portions 31 to 34, the respective
information storage portions 38 and 39 and the library storage
portions 41 and 42 are constructed with mutually different storage
media. However, the present invention is not limited to the shown
embodiments. Namely, the respective storage portions may be formed by
different storage regions of a common storage medium.

In this case, since the respective program storage portions 31 to 34
and the library storage portions 41 and 42 hold programs or
rearrangeable libraries requiring relatively large storage capacity,
they may be constructed with a storage medium such as a CD-ROM. On
the other hand, since the respective information storage portions 38
and 39 hold data requiring relatively small storage capacity, these
information storage portions 38 and 39 may be formed with a
semiconductor memory, such as RAM, ROM or so forth.

On the other hand, the respective means in the foregoing embodiments
are illustrated as hardware construction. The present invention
should not be limited to the constructions shown in the illustrated
embodiments. Namely, it is possible to form the program
transformation system with a computer having a CPU (central
processing unit), an internal storage device, such as ROM, RAM or so
forth, an external storage device, such as an FDD (floppy disk
drive), HDD (hard disk drive), CD-ROM drive and so forth, output
means and input means, and the foregoing compilers 35 and 44, the
linker 36 and the profilers 37 and 45 may be constructed with CPU,
and a program transformation program realizing the
foregoing functions may be stored in the storage medium, such
as the semiconductor memory including ROM or the like, 
FD, HD, CD-ROM or so forth. 

In this case, the foregoing internal storage device or the 
external storage device serves as respective program storage 
portions 31 to 34, respective information storage portions 38 
and 39 and the library storage portions 41 and 42. The 
program transforming program is loaded to CPU from the
storage medium to control CPU operation. CPU is responsive
to triggering of the program transforming program to
serve as the compilers 35 and 44, the linker 36 and the
profilers 37 and 45. Under control of the program transforming
program, the foregoing process is executed.

As set forth above, with the construction of the present
invention, conflict between various procedures on the cache
memory can be avoided. Also, cache misses on frequently
used procedures can be prevented. As a result, the execution
speed of the computer or CPU can be increased.
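
To make the notion of conflict on the cache memory concrete, the following is a minimal sketch in C, assuming a direct-mapped cache. The line size, the number of lines and the names line_index, lines_spanned and procs_conflict are hypothetical and are chosen only for illustration; they do not appear in the embodiments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LINE_SIZE 32u   /* bytes per cache line (assumed value) */
#define NUM_LINES 256u  /* lines in the cache (assumed value)   */

/* Cache line index that a byte address maps to in a direct-mapped cache. */
unsigned line_index(uintptr_t addr)
{
    return (unsigned)((addr / LINE_SIZE) % NUM_LINES);
}

/* Number of cache lines touched by a procedure of `size` bytes at `addr`. */
size_t lines_spanned(uintptr_t addr, size_t size)
{
    return (size_t)((addr % LINE_SIZE + size + LINE_SIZE - 1u) / LINE_SIZE);
}

/* Returns 1 when two procedures share at least one cache line index,
 * 0 otherwise. Each procedure covers a run of consecutive line indices
 * that may wrap around, so the check is a circular interval overlap. */
int procs_conflict(uintptr_t a, size_t a_size, uintptr_t b, size_t b_size)
{
    assert(a_size > 0 && b_size > 0);
    size_t a_lines = lines_spanned(a, a_size);
    size_t b_lines = lines_spanned(b, b_size);
    unsigned a0 = line_index(a);
    unsigned b0 = line_index(b);
    unsigned a_to_b = (b0 + NUM_LINES - a0) % NUM_LINES; /* steps from a0 up to b0 */
    unsigned b_to_a = (a0 + NUM_LINES - b0) % NUM_LINES; /* steps from b0 up to a0 */
    return a_to_b < a_lines || b_to_a < b_lines;
}
```

Under these assumed parameters, two procedures placed exactly one cache size (8192 bytes) apart map to the same line indices and therefore conflict, whereas two procedures placed back to back within one cache size do not.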

Although the invention has been illustrated and described 
with respect to exemplary embodiment thereof, it should be 
understood by those skilled in the art that the foregoing and 
various other changes, omissions and additions may be 
made therein and thereto, without departing from the spirit 
and scope of the present invention. Therefore, the present 
invention should not be understood as limited to the specific 
embodiment set out above but to include all possible
embodiments which can be embodied within a scope encompassed
and equivalents thereof with respect to the feature set
out in the appended claims.

What is claimed is: 

1. A program transformation method for transforming a 
source program described by a programming language into 
an object program described by a language executable by a 
data processing system, comprising: 

first process of transforming at least a part of procedure, 
function or sub-routine used in said source program 
into a form so that said object program can be stored in 
an arbitrary storage region of a primary storage device 
of said data processing system; 

second process of arranging procedure, function or sub- 
routine transformed or not transformed in said first 
process in said storage region corresponding to cache 
line of a cache memory among storage region of said 
primary storage device without causing cache conflict 
on the basis of information relating to said procedure, 
function or sub-routine obtained during a process of 
transformation of said source program into said object 
program; and 

third process of generating said object program, on the 
basis of the result of arrangement. 

2. A program transformation method as set forth in claim 
1, wherein said procedure, function or sub-routine is at least 
one of that defined by a user in said source program, that 
defined and inspected by the user, that preliminarily pre- 
pared in a processing system in said programming language 
and that preliminarily prepared in a form of instruction code. 

3. A program transformation method as set forth in claim 
1, wherein said information is obtained by execution of a 
temporary object program transformed from said source 
program and is consisted of information indicative of number
of times that said procedure, function or sub-routine is
actually called and information indicative of call relation- 
ship between procedures, functions or sub-routines. 
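
The two kinds of information recited here, the number of times each procedure is actually called and the call relationship between procedures, can be pictured with the following C sketch. It is an illustration under stated assumptions, not the profiler of the embodiments: the structure dyn_info, the function record_call and the identifier P_MAIN (a hypothetical outermost caller) are invented for this example, and the recorded pattern follows FIG. 2 as read here, in which func calls func1, func2 and func1 again.

```c
#include <assert.h>
#include <string.h>

/* Procedure identifiers; P_MAIN is a hypothetical outermost caller. */
enum { P_MAIN, P_FUNC, P_FUNC1, P_FUNC2, P_COUNT };

/* Dynamic information gathered while the temporary object program runs. */
struct dyn_info {
    unsigned long calls[P_COUNT];          /* times each procedure was called */
    unsigned long edges[P_COUNT][P_COUNT]; /* edges[caller][callee] counts    */
};

void dyn_info_init(struct dyn_info *d)
{
    memset(d, 0, sizeof *d);
}

/* Counting code of the kind that would run on every procedure call. */
void record_call(struct dyn_info *d, int caller, int callee)
{
    assert(caller >= 0 && caller < P_COUNT);
    assert(callee >= 0 && callee < P_COUNT);
    d->calls[callee]++;
    d->edges[caller][callee]++;
}

/* One execution of func following the call pattern read from FIG. 2. */
void run_fig2_pattern(struct dyn_info *d)
{
    record_call(d, P_MAIN, P_FUNC);
    record_call(d, P_FUNC, P_FUNC1);
    record_call(d, P_FUNC, P_FUNC2);
    record_call(d, P_FUNC, P_FUNC1);
}
```

After one run of run_fig2_pattern on an initialized structure, the call count of func1 is 2, the call count of func2 is 1, and the only caller recorded for either of them is func.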

4. A program transformation method as set forth in claim
1, wherein said information is obtained by execution of a 
temporary object program transformed from said source 
program and is consisted of information indicative of num- 
ber of times that said procedure, function or sub-routine is 
actually called and information indicative of call relationship
between procedures, functions or sub-routines,

in said second process, 

said procedures, functions or sub-routines are divided into 
a plurality of groups on the basis of call frequency, and 

said procedures, functions or sub-routines are arranged in 
said storage region corresponding to cache lines of a
cache memory among the storage region of said pri- 
mary storage device. 
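
The division into groups on the basis of call frequency, followed by arrangement, can be sketched in C as below. This is a deliberately simplified placement and is not the arrangement procedure of the embodiments: it assumes a direct-mapped cache, only two groups (frequently and infrequently called), and a caller-supplied threshold. The cache geometry and the names proc, arrange and hot_threshold are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define LINE_SIZE  32u                 /* bytes per cache line (assumed) */
#define CACHE_SIZE (LINE_SIZE * 256u)  /* total cache bytes (assumed)    */

struct proc {
    const char   *name;
    unsigned long calls;   /* call count taken from the dynamic information */
    size_t        size;    /* code size in bytes                            */
    int           hot;     /* set to 1 if placed in the frequent group      */
    size_t        offset;  /* assigned offset within the storage region     */
};

/* qsort comparator: larger call counts first. */
static int by_calls_desc(const void *x, const void *y)
{
    const struct proc *a = x, *b = y;
    return (a->calls < b->calls) - (a->calls > b->calls);
}

/* Sorts procedures by call count and lays them out line-aligned from
 * offset 0. Procedures called at least hot_threshold times form the
 * frequent group for as long as they fit within one cache size, so no
 * two of them share a cache line. Returns the size of that group. */
size_t arrange(struct proc *p, size_t n, unsigned long hot_threshold)
{
    size_t offset = 0, hot_count = 0;
    int hot_open = 1;

    if (n == 0)
        return 0;
    assert(p != NULL);
    qsort(p, n, sizeof *p, by_calls_desc);
    for (size_t i = 0; i < n; i++) {
        size_t padded = (p[i].size + LINE_SIZE - 1u) / LINE_SIZE * LINE_SIZE;
        if (hot_open &&
            (p[i].calls < hot_threshold || offset + padded > CACHE_SIZE))
            hot_open = 0;              /* frequent group is closed from here on */
        p[i].hot = hot_open;
        p[i].offset = offset;
        offset += padded;
        hot_count += (size_t)hot_open;
    }
    return hot_count;
}
```

Because the frequent group is packed into at most one cache size of contiguous, line-aligned storage, its members cannot evict one another in a direct-mapped cache; procedures outside the group are placed afterwards and may still conflict with it.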

5. A program transformation method for transforming a 
source program described by a programming language into 
an object program described by a language executable by a 
data processing system, comprising: 

first process of transforming at least a part of procedure, 
function or sub-routine used in said source program 
into a form for storing in an arbitrary storage region of 
a primary storage device of said data processing system 
when said object program is used in said data process- 
ing system; 

second process of transforming said source program into 
said object program, and in conjunction therewith, and 
concerning said object program, transforming 
procedure, function or sub-routine defined by a user in 
said source program into a form storable in arbitrary 
region of said primary storage device; 

third process of linking the procedure, function or sub- 
routine transformed into said first process and the 
object program obtained in said second process; 

fourth process of collecting dynamic information con- 
sisted of information indicative of number of times that 
the procedure, function or sub-routine is actually 
called, and information indicative of call relationship 
between said procedure, function or sub-routine with 
executing the object program obtained through said 
third process; 

fifth process of arranging said procedure, function or 
sub-routine in said storage region corresponding to the 
cache line of the cache memory among the storage 
region of said primary storage device with avoiding 
cache conflict, on the basis of said dynamic informa- 
tion; and 

sixth process of generating a final object program by 
linking said procedure, function or sub-routine trans- 
formed in said first process and the object program 
obtained in said second process, on the basis of said 
arrangement information. 

6. A program transformation method as set forth in claim 
5, wherein said procedure, function or sub-routine is at least
one of that defined by a user in said source program, that 
defined and inspected by the user, that preliminarily pre- 
pared in a processing system in said programming language 
and that preliminarily prepared in a form of instruction code.

7. A program transformation method as set forth in claim 
5, wherein in said second process, 

said procedures, functions or sub-routines are divided into 
a plurality of groups on the basis of call frequency, and 

said procedures, functions or sub-routines are arranged in 
said storage region corresponding to cache lines of a 
cache memory among the storage region of said pri- 
mary storage device. 

8. A program transformation method for transforming a
source program described by a programming language into 
an object program described by a language executable by a 
data processing system, comprising: 

first process of transforming said source program into a 
temporary object program, and in conjunction 
therewith, upon executing said temporary object 
program, inserting a code for counting number of times 
that said procedure, function or sub-routine is actually 
called; 

second process of linking one of the procedure, function 
or sub-routine that defined by a user in said source
program, that defined and inspected by the user, that 
preliminarily prepared in a processing system in said 
programming language and that preliminarily prepared 
in a form of instruction code, with said temporary 
object program obtained through said first process; 

third process of collecting dynamic information consisted 
of information indicative of number of times that the 
procedure, function or sub-routine is actually called, 
and information indicative of call relationship between 
said procedure, function or sub-routine with executing 
the object program obtained through said second pro- 
cess; 

fourth process of arranging said procedure, function or 
sub-routine in said storage region corresponding to the 
cache line of the cache memory among the storage 
region of said primary storage device with avoiding 
cache conflict, on the basis of said dynamic informa- 
tion; and 

fifth process of transforming at least part of one defined by 
a user in said source program, one defined and 
inspected by the user, one preliminarily prepared in a 
processing system in said programming language and 
one preliminarily prepared in a form of instruction code 
among the procedure, function or sub-routine to be 
used in said source program into a form storable in an 
arbitrary storage region of said primary storage region, 
in which said object program is stored as actually used 
in said data processing system; 

sixth process, after transforming said source program into 
said object program, concerning said object program, 
transforming procedure, function or sub-routine 
defined by a user in said source program into a form 
storable in arbitrary region of said primary storage 
device; 

seventh process of generating a final object program by 
linking said procedure, function or sub-routine trans- 
formed in said fifth process and the object program 
obtained in said sixth process, on the basis of said 
arrangement information. 

9. A program transformation method as set forth in claim 
8, wherein in said fourth process, 

said procedures, functions or sub-routines are divided into 
a plurality of groups on the basis of call frequency, and 

said procedures, functions or sub-routines are arranged in 
said storage region corresponding to cache lines of a 
cache memory among the storage region of said pri- 
mary storage device. 

10. A program transformation system for transforming a 
source program described by a programming language into 
an object program described by a language executable by a 
data processing system, comprising: 

procedure transforming means for transforming at least a
part of procedure, function or sub-routine used in said
source program into a form so that said object program
can be stored in an arbitrary storage region of a primary
storage device of said data processing system;

optimizing means for arranging procedure, function or 
sub-routine transformed or not transformed in said 
procedure transforming means in said storage region 
corresponding to cache line of a cache memory among 
storage region of said primary storage device without 
causing cache conflict on the basis of information 
relating to said procedure, function or sub-routine 
obtained during a process of transformation of said 
source program into said object program; and

generating means for generating said object program, on 
the basis of the result of arrangement. 

11. A program transformation system as set forth in claim 
10, wherein said procedure, function or sub-routine is at 
least one of that defined by a user in said source program, 
that defined and inspected by the user, that preliminarily 
prepared in a processing system in said programming lan- 
guage and that preliminarily prepared in a form of instruc- 
tion code. 

12. A program transformation system as set forth in claim 
10, wherein said information is obtained by execution of a 
temporary object program transformed from said source 
program and is consisted of information indicative of number
of times that said procedure, function or sub-routine is
actually called and information indicative of call relation- 
ship between procedures, functions or sub-routines. 

13. A program transformation system as set forth in claim 
10, wherein said information is obtained by execution of a
temporary object program transformed from said source
program and is consisted of information indicative of num- 
ber of times that said procedure, function or sub-routine is 
actually called and information indicative of call relationship
between procedures, functions or sub-routines,

in said optimizing means,

said procedures, functions or sub-routines are divided into
a plurality of groups on the basis of call frequency, and

said procedures, functions or sub-routines are arranged in
said storage region corresponding to cache lines of a 
cache memory among the storage region of said pri- 
mary storage device. 

14. A program transformation system for transforming a 
source program described by a programming language into 
an object program described by a language executable by a 
data processing system, comprising: 

procedure transforming means for transforming at least a 
part of procedure, function or sub-routine used in said 
source program into a form for storing in an arbitrary
storage region of a primary storage device of said data
processing system when said object program is used in
said data processing system;

program transforming means for transforming said source
program into said object program, and in conjunction
therewith, and concerning said object program,
transforming procedure, function or sub-routine defined by
a user in said source program into a form storable in
arbitrary region of said primary storage device;

linking means for linking the procedure, function or
sub-routine transformed into said procedure transforming
means and the object program obtained in said
program transforming means;

dynamic information collecting means for collecting
dynamic information consisted of information indicative
of number of times that the procedure, function or
sub-routine is actually called, and information indicative
of call relationship between said procedure, function
or sub-routine with executing the object program
obtained through said linking means; and

optimizing means for arranging said procedure, function 
or sub-routine in said storage region corresponding to 
the cache line of the cache memory among the storage 
region of said primary storage device with avoiding 
cache conflict, on the basis of said dynamic informa- 
tion; 

said linking means generating a final object program by 
linking said procedure, function or sub-routine trans- 
formed in said procedure transforming means and the 
object program obtained in said program transforming 
means, on the basis of said arrangement information. 

15. A program transformation system as set forth in claim 
14, wherein in said optimizing means, 

said procedures, functions or sub-routines are divided into 
a plurality of groups on the basis of call frequency, and

said procedures, functions or sub-routines are arranged in 
said storage region corresponding to cache lines of a 
cache memory among the storage region of said pri- 
mary storage device. 

16. A program transformation system for transforming a 
source program described by a programming language into 
an object program described by a language executable by a 
data processing system, comprising: 

program transforming means for transforming said source 
program into a temporary object program, and in con- 
junction therewith, upon executing said temporary 
object program, inserting a code for counting number
of times that said procedure, function or sub-routine is
actually called; 

linking means for linking one of the procedure, function 
or sub-routine that defined by a user in said source 
program, that defined and inspected by the user, that 
preliminarily prepared in a processing system in said
programming language and that preliminarily prepared 
in a form of instruction code, with said temporary 
object program obtained through said program trans- 
forming means; 

dynamic information collecting means for collecting
dynamic information consisted of information indicative
of number of times that the procedure, function or
sub-routine is actually called, and information indicative
of call relationship between said procedure, function
or sub-routine with executing the object program
obtained through said linking means;

optimizing means for arranging said procedure, function 
or sub-routine in said storage region corresponding to 
the cache line of the cache memory among the storage 
region of said primary storage device with avoiding
cache conflict, on the basis of said dynamic informa- 
tion; and 

procedure transforming means for transforming at least 
part of one defined by a user in said source program, 
one defined and inspected by the user, one preliminarily
prepared in a processing system in said programming 
language and one preliminarily prepared in a form of 
instruction code among the procedure, function or 
sub-routine to be used in said source program into a 
form storable in an arbitrary storage region of said
primary storage region, in which said object program is
stored as actually used in said data processing system;

said program transforming means transforming said
source program into said object program, concerning
said object program, transforming procedure, function
or sub-routine defined by a user in said source program 
into a form storable in arbitrary region of said primary 
storage device, and said linking means generating a 
final object program by linking said procedure, func- 
tion or sub-routine transformed in said procedure trans- 
forming means and the object program obtained in said 
program transforming means, on the basis of said 
arrangement information. 

17. A program transformation system as set forth in claim 
16, wherein in said optimizing means, 

said procedures, functions or sub-routines are divided into
a plurality of groups on the basis of call frequency, and 

said procedures, functions or sub-routines are arranged in 
said storage region corresponding to cache lines of a 
cache memory among the storage region of said pri- 
mary storage device. 

18. A computer readable memory storing a language 
processing program for transforming a source program 
described by a programming language into an object pro- 
gram described by a language executable by a data process- 
ing system, said language processing program comprising: 

first process of transforming at least a part of procedure, 
function or sub-routine used in said source program 
into a form so that said object program can be stored in 
an arbitrary storage region of a primary storage device 
of said data processing system; 

second process of arranging procedure, function or sub- 
routine transformed or not transformed in said first 
process in said storage region corresponding to cache 
line of a cache memory among storage region of said 
primary storage device without causing cache conflict 
on the basis of information relating to said procedure, 
function or sub-routine obtained during a process of 
transformation of said source program into said object 
program; and 

third process of generating said object program, on the 
basis of the result of arrangement. 

19. A computer readable memory as set forth in claim 18, 
wherein said information is obtained by execution of a 
temporary object program transformed from said source 
program and is consisted of information indicative of num- 
ber of times that said procedure, function or sub-routine is 
actually called and information indicative of call relation- 
ship between procedures, functions or sub-routines, 

in said second process, 

said procedures, functions or sub-routines are divided into 
a plurality of groups on the basis of call frequency, and 

said procedures, functions or sub-routines are arranged in 
said storage region corresponding to cache lines of a 
cache memory among the storage region of said pri- 
mary storage device. 

20. A computer readable memory storing a language 
processing program for transforming a source program 
described by a programming language into an object pro- 
gram described by a language executable by a data process- 
ing system, said language processing program comprising: 

first process of transforming at least a part of procedure, 
function or sub-routine used in said source program 
into a form for storing in an arbitrary storage region of 
a primary storage device of said data processing system 
when said object program is used in said data process- 
ing system; 

second process of transforming said source program into
said object program, and in conjunction therewith, and 
concerning said object program, transforming 
procedure, function or sub-routine defined by a user in 
said source program into a form storable in arbitrary
region of said primary storage device; 

third process of linking the procedure, function or sub- 
routine transformed into said first process and the 
object program obtained in said second process; 

fourth process of collecting dynamic information con- 
sisted of information indicative of number of times that 
the procedure, function or sub-routine is actually 
called, and information indicative of call relationship
between said procedure, function or sub-routine with 
executing the object program obtained through said 
third process; 

fifth process of arranging said procedure, function or
sub-routine in said storage region corresponding to the
cache line of the cache memory among the storage 
region of said primary storage device with avoiding 
cache conflict, on the basis of said dynamic informa- 
tion; and 

sixth process of generating a final object program by 
linking said procedure, function or sub-routine trans- 
formed in said first process and the object program 
obtained in said second process, on the basis of said 
arrangement information. 

21. A computer readable memory as set forth in claim 20, 
wherein in said second process, 

said procedures, functions or sub-routines are divided into
a plurality of groups on the basis of call frequency, and 

said procedures, functions or sub-routines are arranged in 
said storage region corresponding to cache lines of a 
cache memory among the storage region of said pri- 
mary storage device. 

22. A computer readable memory storing a language 
processing program for transforming a source program 
described by a programming language into an object pro- 
gram described by a language executable by a data process- 
ing system, said language processing program comprising: 

first process of transforming said source program into a 
temporary object program, and in conjunction 
therewith, upon executing said temporary object 
program, inserting a code for counting number of times
that said procedure, function or sub-routine is actually 
called; 

second process of linking one of the procedure, function
or sub-routine that defined by a user in said source 
program, that defined and inspected by the user, that 
preliminarily prepared in a processing system in said 
programming language and that preliminarily prepared 
in a form of instruction code, with said temporary 
object program obtained through said first process; 

third process of collecting dynamic information consisted 
of information indicative of number of times that the 
procedure, function or sub-routine is actually called, 
and information indicative of call relationship between 
said procedure, function or sub-routine with executing 
the object program obtained through said second pro- 
cess; 

fourth process of arranging said procedure, function or 
sub-routine in said storage region corresponding to the 
cache line of the cache memory among the storage 
region of said primary storage device with avoiding 
cache conflict, on the basis of said dynamic informa- 
tion; and 

fifth process of transforming at least part of one defined by 
a user in said source program, one defined and 
inspected by the user, one preliminarily prepared in a 
processing system in said programming language and 
one preliminarily prepared in a form of instruction code 
among the procedure, function or sub-routine to be 
used in said source program into a form storable in an 
arbitrary storage region of said primary storage region, 
in which said object program is stored as actually used 
in said data processing system; 

sixth process, after transforming said source program into 
said object program, concerning said object program, 
transforming procedure, function or sub-routine 
defined by a user in said source program into a form 
storable in arbitrary region of said primary storage 
device; 

seventh process of generating a final object program by 
linking said procedure, function or sub-routine trans- 
formed in said fifth process and the object program 
obtained in said sixth process, on the basis of said 
arrangement information. 

23. A computer readable memory as set forth in claim 22, 
wherein in said fourth process, 

said procedures, functions or sub-routines are divided into
a plurality of groups on the basis of call frequency, and 

said procedures, functions or sub-routines are arranged in 
said storage region corresponding to cache lines of a 
cache memory among the storage region of said pri- 
mary storage device. 